Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark
Who is this presentation for?
- Big data engineers, deep learning engineers, and data scientists
Analytics Zoo provides a unified analytics and AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline. The entire pipeline can then transparently scale out to a large Hadoop/Spark cluster for distributed training or inference.
Jason Dai, Yuhao Yang, Jennie Wang, and Guoqiong Song explain how to build and productionize deep learning applications for big data with Analytics Zoo, including transfer learning-based image classification, sequence-to-sequence prediction for precipitation nowcasting, neural collaborative filtering for recommendations, and unsupervised time series anomaly detection. They draw on real-world use cases from JD.com, MLSListings, the World Bank, Baosight, and Midea/KUKA.
Prerequisite knowledge
- Familiarity with big data and machine learning
Materials or downloads needed in advance
- A laptop
- A GitHub account
What you'll learn
- Explore emerging deep learning frameworks for big data
- Learn practical design patterns for distributed systems and algorithms for these frameworks
- Gain experience using innovative application pipelines and architecture for the new class of deep learning applications on big data platforms
Jason Dai is a senior principal engineer and chief architect for big data technologies at Intel, where he leads the development of advanced big data analytics, including distributed machine learning and deep learning. Jason is an internationally recognized expert on big data, the cloud, and distributed machine learning; he is the cochair of the Strata Data Conference in Beijing, a committer and PMC member of the Apache Spark project, and the creator of BigDL, a distributed deep learning framework on Apache Spark.
Yuhao Yang is a senior software engineer on the big data team at Intel, where he focuses on deep learning algorithms and applications—particularly distributed deep learning and machine learning solutions for fraud detection, recommendation, speech recognition, and visual perception. He’s also an active contributor to Apache Spark MLlib.
Jiao (Jennie) Wang is a software engineer on the big data technology team at Intel, where she works in the area of big data analytics. She’s engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.
Guoqiong Song is a senior deep learning software engineer on the big data technology team at Intel. She’s interested in developing and optimizing distributed deep learning algorithms on Spark. She holds a PhD in atmospheric and oceanic sciences with a focus on numerical modeling and optimization from UCLA.