Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark

Who is this presentation for?
- Big data engineers, deep learning engineers, and data scientists
Level
Intermediate
Description
Analytics Zoo provides a unified analytics and AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline. The entire pipeline can then transparently scale out to a large Hadoop and Spark cluster for distributed training or inference.
Jason Dai, Yuhao Yang, Jennie Wang, and Guoqiong Song explain how to build and productionize deep learning applications for big data (transfer learning-based image classification, sequence-to-sequence prediction for precipitation nowcasting, neural collaborative filtering for recommendations, unsupervised time series anomaly detection, etc.) with Analytics Zoo, using real-world use cases from JD.com, MLS Listings, the World Bank, Baosight, and Midea/KUKA.
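As a rough illustration of the kind of distributed training the description refers to, the sketch below simulates synchronous data-parallel gradient descent in plain Python. It does not use Spark, BigDL, or the Analytics Zoo API. The helper names (`partition`, `local_gradient`, `train`), the toy linear model, and the hyperparameters are all assumptions made for this example. The idea shown is only the general pattern: each data partition computes a gradient locally, the gradients are aggregated, and one shared set of weights is updated.

```python
# Conceptual sketch only: simulates data-parallel training in one process.
# No Spark, BigDL, or Analytics Zoo calls are used; all names are hypothetical.
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature x, label y)


def partition(data: List[Sample], num_partitions: int) -> List[List[Sample]]:
    """Split the dataset round-robin, standing in for cluster partitions."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def local_gradient(shard: List[Sample], w: float, b: float) -> Tuple[float, float, int]:
    """Summed squared-error gradient for y = w*x + b over one shard, plus shard size."""
    grad_w = 0.0
    grad_b = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        grad_w += 2.0 * err * x
        grad_b += 2.0 * err
    return grad_w, grad_b, len(shard)


def train(data: List[Sample], num_partitions: int = 4,
          lr: float = 0.1, steps: int = 1000) -> Tuple[float, float]:
    """Synchronous data-parallel gradient descent on a toy linear model."""
    shards = partition(data, num_partitions)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # "Map" step: every partition computes its gradient independently.
        results = [local_gradient(shard, w, b) for shard in shards]
        # "Reduce" step: combine the partial gradients into one global average.
        total = sum(count for _, _, count in results)
        grad_w = sum(gw for gw, _, _ in results) / total
        grad_b = sum(gb for _, gb, _ in results) / total
        # All workers then continue from the same updated parameters.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(20)]
w, b = train(data)
print(f"w={w:.4f}, b={b:.4f}")
```

Because the partial gradients are summed and divided by the total sample count, the update matches full-batch gradient descent up to floating-point rounding regardless of how many partitions are used. In this simplified setting, that is why adding partitions does not change the result.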
Prerequisite knowledge
- Familiarity with big data and machine learning
Materials or downloads needed in advance
- A laptop with a GitHub account
What you'll learn
- Explore emerging deep learning frameworks for big data
- Learn practical design patterns for distributed systems and algorithms for these frameworks
- Gain experience using innovative application pipelines and architecture for the new class of deep learning applications on big data platforms

Jason Dai
Intel
Jason Dai is a senior principal engineer and chief architect for big data technologies at Intel, where he leads the development of advanced big data analytics, including distributed machine learning and deep learning. Jason is an internationally recognized expert on big data, the cloud, and distributed machine learning; he’s the cochair of the Strata Data Conference in Beijing, a committer and PMC member of the Apache Spark project, and the creator of BigDL, a distributed deep learning framework on Apache Spark.

Yuhao Yang
Intel
Yuhao Yang is a senior software engineer on the big data team at Intel, where he focuses on deep learning algorithms and applications—particularly distributed deep learning and machine learning solutions for fraud detection, recommendation, speech recognition, and visual perception. He’s also an active contributor to Apache Spark MLlib.

Jiao (Jennie) Wang
Intel
Jiao (Jennie) Wang is a software engineer on the big data technology team at Intel, where she works in the area of big data analytics. She’s engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.

Guoqiong Song
Intel
Guoqiong Song is a senior deep learning software engineer on the big data technology team at Intel. She’s interested in developing and optimizing distributed deep learning algorithms on Spark. She holds a PhD in atmospheric and oceanic sciences with a focus on numerical modeling and optimization from UCLA.
Comments
- Analytics Zoo documentation: https://analytics-zoo.github.io/master/
- Analytics Zoo GitHub repository: https://github.com/intel-analytics/analytics-zoo
- Tutorial GitHub repository: https://github.com/intel-analytics/OreillyAI2019