Deep learning has fueled significant progress in computer vision, speech recognition, and natural language processing. To take just one example, a single deep learning algorithm can learn to recognize two vastly different languages, English and Mandarin, and begin to synthesize realistic human speech. However, it turns out that deep learning is compute-limited, even on the fastest machines that we can build. Accuracy scales with data and compute, transforming some difficult AI problems into problems of computational scale. Baidu thinks that high-performance computing can help address these challenges.
Greg Diamos covers challenges to further improving performance and outlines a plan of attack for tearing down the remaining obstacles standing in the way of strong-scaling deep learning to the largest machines in the world. Greg outlines the performance characteristics of Baidu’s deep learning workloads in detail, focusing on the recurrent neural networks used in the company’s conversational interfaces as a case study. He concludes by sharing open problems across the entire hardware and software stack, from electrons to AI frameworks, and suggesting directions for future work.
Greg Diamos leads computer systems research at Baidu’s Silicon Valley AI Lab (SVAIL), where he helped develop the Deep Speech and Deep Voice systems. Previously, Greg contributed to the design of compiler and microarchitecture technologies used in the Volta GPU at NVIDIA. Greg holds a PhD from the Georgia Institute of Technology, where he led the development of the GPU-Ocelot dynamic compiler, which targeted CPUs and GPUs from the same program representation.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.