Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark

Jason Dai (Intel), Yuhao Yang (Intel), Jiao (Jennie) Wang (Intel), Guoqiong Song (Intel)

1:30pm–5:00pm Tuesday, March 26, 2019

Data Science, Machine Learning & AI
Location: 2009

Secondary topics: Deep Learning, Temporal data and time-series analytics

Who is this presentation for?

Big data engineers, deep learning engineers, and data scientists

Level

Intermediate

Prerequisite knowledge

Familiarity with big data and machine learning

Materials or downloads needed in advance

A laptop
A GitHub account

What you'll learn

Explore emerging deep learning frameworks for big data
Learn practical design patterns for distributed systems and algorithms for these frameworks
Gain experience using innovative application pipelines and architecture for the new class of deep learning applications on big data platforms

Description

Analytics Zoo provides a unified analytics and AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline. The entire pipeline can then transparently scale out to a large Hadoop/Spark cluster for distributed training or inference.
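
The idea of scaling a training pipeline out across a cluster can be illustrated with a minimal sketch of synchronous data-parallel gradient descent. This is plain Python, not the Analytics Zoo or BigDL API: the function names (`partition`, `local_gradient`, `train`), the toy linear model, and the in-process "shards" standing in for Spark partitions are all assumptions made for illustration only.

```python
def partition(rows, num_partitions):
    """Split rows round-robin into shards (a stand-in for Spark partitions)."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def local_gradient(shard, w, b):
    """Summed squared-error gradient for y ~ w*x + b over one shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    return gw, gb, len(shard)


def train(rows, num_partitions=4, lr=0.2, epochs=500):
    """Synchronous data-parallel gradient descent on a toy linear model."""
    shards = partition(rows, num_partitions)
    w = b = 0.0
    for _ in range(epochs):
        # "Map" step: every worker computes a gradient on its own shard.
        partials = [local_gradient(s, w, b) for s in shards]
        # "Reduce" step: aggregate partial gradients into one global update.
        gw = sum(p[0] for p in partials)
        gb = sum(p[1] for p in partials)
        n = sum(p[2] for p in partials)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


# Noise-free toy data generated from y = 3x + 1.
data = [(k / 10.0, 3.0 * (k / 10.0) + 1.0) for k in range(20)]
w, b = train(data)
print(round(w, 3), round(b, 3))
```

Because the per-shard gradients are summed before each update, the result matches single-node training on the full dataset up to floating-point rounding; only the gradient computation is spread across workers.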

Jason Dai, Yuhao Yang, Jennie Wang, and Guoqiong Song explain how to build and productionize deep learning applications for big data (transfer learning-based image classification, sequence-to-sequence prediction for precipitation nowcasting, neural collaborative filtering for recommendations, unsupervised time series anomaly detection, etc.) with Analytics Zoo, using real-world use cases from JD.com, MLSListings, the World Bank, Baosight, and Midea/KUKA.

Jason Dai

Intel

Jason Dai is a senior principal engineer and chief architect for big data technologies at Intel, where he leads the development of advanced big data analytics, including distributed machine learning and deep learning. Jason is an internationally recognized expert on big data, the cloud, and distributed machine learning; he’s the cochair of the Strata Data Conference in Beijing, a committer and PMC member of the Apache Spark project, and the creator of BigDL, a distributed deep learning framework on Apache Spark.

Yuhao Yang

Intel

Yuhao Yang is a senior software engineer on the big data team at Intel, where he focuses on deep learning algorithms and applications—particularly distributed deep learning and machine learning solutions for fraud detection, recommendation, speech recognition, and visual perception. He’s also an active contributor to Apache Spark MLlib.

Jiao (Jennie) Wang

Intel

Jiao (Jennie) Wang is a software engineer on the big data technology team at Intel, where she works in the area of big data analytics. She’s engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.

Guoqiong Song

Intel

Guoqiong Song is a senior deep learning software engineer on the big data technology team at Intel. She’s interested in developing and optimizing distributed deep learning algorithms on Spark. She holds a PhD in atmospheric and oceanic sciences with a focus on numerical modeling and optimization from UCLA.

Comments

Jiao (Jennie) Wang | SOFTWARE ENGINEER

04/05/2019 6:07am PDT

Analytics Zoo supports TF training, fine-tuning, and inference on Spark. If you have a TF face recognition model, you can try running it on Spark with Analytics Zoo.

James Wang | SOLUTION ARCHITECT

04/05/2019 5:36am PDT

I attended this tutorial. Can you send me a copy of your slides? My email is jcwang@us.ibm.com.
Thanks, James Wang

Craig Holley | SENIOR BIG DATA ARCHITECT

01/28/2019 10:57pm PST

I am specifically interested in the integration of GPU/HPC tech underneath Spark to speed up TF and DL tech in general. In particular, I am working on how to do stream processing in Spark for things like on-the-fly face recognition, with TensorFlow-based DL face recognition layered on top.
