Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Distributed deep learning at scale on Apache Spark with BigDL

Ding Ding (Intel)
14:00–14:30 Tuesday, 23 May 2017
Hardcore Data Science, Spark & beyond
Location: London Suite 2/3
Secondary topics: Deep learning
Level: Intermediate

Built on Apache Spark, BigDL brings native deep learning support to Spark. By leveraging Intel MKL, it provides orders-of-magnitude speed-ups in single-node performance over out-of-the-box open source deep learning frameworks such as Caffe, Torch, and TensorFlow, and it scales out deep learning workloads efficiently on the Spark architecture. Ding Ding explains how BigDL makes deep learning more accessible to big data users and data scientists. Ding explores how users have adopted BigDL for distributed deep learning analysis on large amounts of data, in applications including image recognition, object detection, and NLP. This lets them use their big data platform (e.g., Apache Hadoop and Spark) as a unified data analytics platform for data storage, data processing and mining, feature engineering, traditional (non-deep) machine learning, and deep learning workloads.

Ding Ding


Ding Ding is a software engineer on Intel’s big data technology team, where she develops and optimizes distributed machine learning and deep learning algorithms on Apache Spark, focusing particularly on large-scale analytical applications and infrastructure.