Deep learning on Apache Spark at CERN’s Large Hadron Collider with Analytics Zoo
Who is this presentation for?
- Big data analytics and AI architects, data engineers, data scientists, and enterprise analytics and AI decision makers
Sajan Govindan dives into how CERN applied end-to-end deep learning and analytics pipelines at scale on Apache Spark for high-energy physics, using the open source BigDL and Analytics Zoo software running on Intel Xeon-based distributed clusters. Sajan outlines technical details and development insights with an example of topology classification to improve real-time event selection at the Large Hadron Collider (LHC). The classifier demonstrated strong efficiency while also reducing the false-positive rate compared to existing methods. It could be used as a filter to improve the online event-selection infrastructure of the LHC experiments, enabling a more flexible and inclusive selection strategy while reducing the downstream resources wasted on processing false positives.
This work is part of CERN's research into applying deep learning and analytics with open source, industry-standard technologies as an alternative to the existing customized rule-based methods. Sajan explores how CERN could quickly build and deploy distributed deep learning solutions and data pipelines at scale on Apache Spark using Analytics Zoo and BigDL, open source frameworks that unify analytics and AI on Spark with easy-to-use APIs and development interfaces seamlessly integrated with big data platforms.
Prerequisite knowledge
- A basic understanding of Apache Spark and deep learning concepts
What you'll learn
- Discover how to simplify the development and deployment of deep learning solutions on big data platforms at scale using open source technologies, and how the scientific computing community applies industry-standard deep learning in its data pipelines
- Learn about the deep learning frameworks BigDL and Analytics Zoo
Sajan Govindan is a solutions architect on the data analytics technologies team at Intel, focusing on open source technologies for big data analytics and AI solutions. Sajan has been with Intel for more than eighteen years and has extensive experience building analytics and AI solutions across industry verticals and domains, working through the advancements in the Hadoop and Spark ecosystems and in machine learning and deep learning frameworks.