Presented By O’Reilly and Intel Nervana
Put AI to work
September 17-18, 2017: Training
September 18-20, 2017: Tutorials & Conference
San Francisco, CA

High-performance computing opportunities in deep learning

Greg Diamos (Baidu)
1:45pm–2:25pm Wednesday, September 20, 2017
Location: Imperial B
Average rating: 4.50 (2 ratings)

What you'll learn

  • Understand how high-performance computing can help address challenges to further improving deep learning performance


Deep learning has fueled significant progress in computer vision, speech recognition, and natural language processing. To take just one example, a single deep learning algorithm can learn to recognize two vastly different languages, English and Mandarin, and begin to synthesize realistic human speech. However, it turns out that deep learning is compute-limited, even on the fastest machines that we can build. Accuracy scales with data and compute, transforming some difficult AI problems into problems of computational scale. Baidu thinks that high-performance computing can help address these challenges.

Greg Diamos covers challenges to further improving performance and outlines a plan of attack for tearing down the remaining obstacles standing in the way of strong scaling deep learning to the largest machines in the world. Greg outlines the performance characteristics of Baidu’s deep learning workloads in detail, focusing on the recurrent neural networks used in the company’s conversational interfaces as a case study, and concludes by sharing open problems across the entire hardware and software stack, from electrons to AI frameworks, and suggesting directions for future work.

Greg Diamos


Greg Diamos leads computer systems research at Baidu’s Silicon Valley AI Lab (SVAIL), where he helped develop the Deep Speech and Deep Voice systems. Previously, Greg contributed to the design of compiler and microarchitecture technologies used in the Volta GPU at NVIDIA. Greg holds a PhD from the Georgia Institute of Technology, where he led the development of the GPU-Ocelot dynamic compiler, which targeted CPUs and GPUs from the same program representation.



08/24/2017 9:43am PDT

Any recommended papers/articles to read before attending this talk?