Presented By O’Reilly and Intel AI
Put AI to work
8-9 Oct 2018: Training
9-11 Oct 2018: Tutorials & Conference
London, UK

Applied machine learning at Facebook: An infrastructure perspective

Yangqing Jia (Alibaba Group), Dmytro Dzhulgakov (Facebook)
11:05–11:45 Wednesday, 10 October 2018
Implementing AI
Location: King's Suite - Sandringham
Secondary topics: Deep Learning tools, Edge computing and Hardware

What you'll learn

  • Explore Facebook's hardware and software infrastructure for machine learning at global scale

Description

Machine learning sits at the core of many essential products and services at Facebook. Yangqing Jia and Dmytro Dzhulgakov offer an overview of the hardware and software infrastructure that supports machine learning at global scale.

Facebook’s machine learning workloads are extremely diverse: services require many different types of models in practice. This diversity has implications at all layers in the system stack. In addition, a sizable fraction of all data stored at Facebook flows through machine learning pipelines, which makes it challenging to deliver data to high-performance distributed training flows. Computational requirements are also intense: training uses both GPU and CPU platforms, and real-time inference relies on abundant CPU capacity. Addressing these and other emerging challenges continues to require diverse efforts that span machine learning algorithms, software, and hardware design.

Yangqing Jia

Alibaba Group

Yangqing Jia leads Alibaba’s AI and Big Data org, supporting large-scale applications both inside the company and on Aliyun, the number one cloud provider in China and a market leader globally. The org provides advanced AI systems and services combined with conventional big data wisdom (EMR, Flink, and Spark), as well as battle-tested solutions to serve every cloud client.

Dmytro Dzhulgakov

Facebook

Dmytro Dzhulgakov is an engineering manager and technical lead for AI infrastructure at Facebook, where he is currently leading the core development of PyTorch 1.0, an open source deep learning platform. Dmytro is one of the cocreators of ONNX, a joint initiative aimed at making AI development more interoperable. Previously, he built several generations of large-scale deep learning recommendation systems at Facebook that powered products from ads to the news feed. Dmytro holds an MS in applied mathematics. He had a successful career in programming competitions and was ranked in the Top 20 on Topcoder.