The rapid growth of deep learning in large-scale real-world applications has sharply increased demand for high-performance training and inference solutions. Major hardware manufacturers have responded with growing investment in deep learning performance, including a proliferation of new application-specific accelerators.
But performance isn’t driven by hardware alone. In the software realm, a new class of deep learning compilers has emerged, which brings to bear both classic and novel compiler techniques in order to maximize the performance of deep learning systems. Recently developed deep learning compilers include NNVM/TVM from the University of Washington and Amazon, Glow from Facebook, XLA from Google, and nGraph from Intel. These deep learning compilers unlock a wealth of optimizations that take a view of the whole data-flow graph. This approach achieves substantial speedups over the approach favored by existing frameworks, where an interpreter orchestrates the invocation of per-op compute kernels that must be optimized specifically for the framework and hardware target.
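To make the contrast concrete, here is a minimal sketch in plain Python. It is not nGraph's API or that of any compiler named above; the `Op`, `interpret`, and `fuse` names are hypothetical and exist only for illustration. It assumes a graph that is a simple chain of elementwise ops, and shows how a pass that sees the whole chain can collapse it into one kernel, while an interpreter launches one kernel per op and materializes every intermediate result.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Op:
    """Hypothetical elementwise op: a name plus a scalar function."""
    name: str
    fn: Callable[[float], float]


def interpret(ops: List[Op], data: List[float]) -> Tuple[List[float], int]:
    """Per-op execution: one kernel launch and one full intermediate per op."""
    launches = 0
    for op in ops:
        data = [op.fn(x) for x in data]  # materializes an intermediate buffer
        launches += 1
    return data, launches


def fuse(ops: List[Op]) -> List[Op]:
    """Whole-graph pass: collapse a chain of elementwise ops into a single op."""
    def fused(x: float) -> float:
        for op in ops:
            x = op.fn(x)
        return x
    return [Op("+".join(op.name for op in ops), fused)]


# A tiny chain standing in for a data-flow graph: scale, add bias, apply ReLU.
graph = [
    Op("scale", lambda x: 2.0 * x),
    Op("bias", lambda x: x + 1.0),
    Op("relu", lambda x: max(x, 0.0)),
]
inputs = [-2.0, -0.5, 0.0, 1.5]

ref, n_ref = interpret(graph, inputs)          # three launches
out, n_fused = interpret(fuse(graph), inputs)  # one launch, same result
print(out == ref, n_ref, n_fused)
```

Real compilers apply many more graph-level transformations than this single fusion, and the launch count here is only a stand-in for the memory traffic and dispatch overhead that fusion avoids.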
Adam Straw, Adam Procter, and Robert Earhart offer a comprehensive overview of Intel’s nGraph deep learning compiler.
Adam Straw is a deep learning software engineer in the Artificial Intelligence Products Group at Intel Corporation. Adam received a B.S. in Computer Engineering from Iowa State University and is currently working on an M.S. in Computer Science with a specialty in machine learning at Georgia Institute of Technology. Adam works on the Intel® nGraph™ deep learning compiler, with a special focus on core design, and is currently responsible for the nGraph quantization scheme.
Adam Procter is a deep learning software engineer in the Artificial Intelligence Products Group at Intel, where he works on the core design of the Intel nGraph deep learning compiler. He holds a PhD in computer science from the University of Missouri, where his research focused on programming language semantics, high-assurance computing, and techniques for compiling functional programming languages to reconfigurable hardware.
Rob Earhart is a deep learning software engineer in the Artificial Intelligence Products Group at Intel, where he works on PlaidML, an open source polyhedral tensor compiler that makes it pretty easy to run neural networks with good performance on a wide variety of hardware. Prior to Intel (and prior to diving into machine learning systems implementation), Rob worked on the NT kernel, was one of the original Hyper-V hypervisor engineers, and cofounded the virtual machine monitor that grew up to power Google Compute Engine.
©2019, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.