Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Building advanced analytics and deep learning on Apache Spark with BigDL

Yuhao Yang (Intel), Zhichao Li (Intel)
1:15pm–1:55pm Wednesday, September 27, 2017
Artificial Intelligence, Machine Learning & Data Science
Location: 1A 12/14
Level: Intermediate
Secondary topics: Deep learning
Average rating: 4.00 (2 ratings)

Who is this presentation for?

  • Data scientists and software engineers

Prerequisite knowledge

  • A basic understanding of Spark and deep learning concepts (e.g., backpropagation, LeNet, and RDD)

What you'll learn

  • Learn how to build distributed deep learning models and end-to-end analytics and deep learning applications on top of Spark
  • Understand how to visualize the training process with TensorBoard


The rapid development of deep learning in recent years has changed the landscape of data analytics and machine learning and contributed to the success of many artificial intelligence applications. BigDL, a new distributed deep learning framework on Apache Spark, gives users big data and deep learning capabilities that are easy to use and seamlessly integrated.

Yuhao Yang and Zhichao Li share real-world examples of end-to-end analytics and deep learning applications built on top of BigDL and Spark, such as speech recognition (e.g., Deep Speech 2), object detection (e.g., Single Shot Multibox Detector), and recommendations, with a particular focus on how users combined BigDL models, feature transformers, and Spark ML to build complete analytics pipelines. Yuhao and Zhichao also explore recent developments in BigDL, including full support for Python APIs (built on top of PySpark), notebook and TensorBoard support, support for reading and writing TensorFlow models, better recurrent and recursive net support, and 3D image convolutions.

Yuhao Yang


Yuhao Yang is a senior software engineer on the big data team at Intel, where he focuses on deep learning algorithms and applications—particularly distributed deep learning and machine learning solutions for fraud detection, recommendation, speech recognition, and visual perception. He’s also an active contributor to Apache Spark MLlib.

Zhichao Li


Zhichao Li is a senior software engineer at Intel focused on distributed machine learning, especially large-scale analytical applications and infrastructure on Spark. He’s also an active contributor to Spark. Previously, Zhichao worked in Morgan Stanley’s FX Department.
