Since its breakthrough in the 2012 ImageNet Challenge, deep learning has become the de facto standard for most computer vision problems. Over the past few years, with deeper and more complex neural network architectures, deep learning algorithms have matched and exceeded human-level performance in image recognition. A growing number of computer vision applications now use deep learning, and many of them are achieving great success. Nevertheless, training deep networks on a large dataset remains very challenging, and the sheer amount of computation needed can take months. The community desperately needs tools that help train deep learning networks across multiple servers with multiple GPUs.
Anusua Trivedi, Barbara Stortz, and Patrick Buehler offer an overview of the Microsoft Cognitive Toolkit (CNTK), which runs natively on both Windows and Linux and offers a flexible symbolic graph, a friendly Python API, and almost linear scalability across multi-GPU systems and multiple machines. CNTK was originally designed for speech-processing tasks and was released under a relatively restrictive license on CodePlex in April 2015. In February 2016, CNTK moved to GitHub under the much friendlier MIT License. In November 2016, CNTK 2.0 was released with both C++ and Python APIs. The Cognitive Toolkit was key to Microsoft Research’s recent breakthrough of reaching human parity in conversational speech recognition. It has been used extensively inside Microsoft for image, text, and speech data, with each area benefiting from the built-in scalability.
Many deep learning toolkits are widely used in the vision community, including Caffe, Torch, Theano, TensorFlow, and MXNet. CNTK has unique advantages over these toolkits, especially in speed and scalability. Anusua, Barbara, and Patrick share a comparison of five well-known toolkits to demonstrate how CNTK achieves almost linear scalability via advanced algorithms such as 1-bit SGD and block-momentum SGD, and they explain these algorithms in detail. Along the way, they’ll discuss the most recent results of a speed comparison between TensorFlow and CNTK. This experiment was conducted independently by NVIDIA on its latest DGX-1 system, with CNTK showing more than a twofold improvement over TensorFlow when eight GPUs were used to train ResNet-50.
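To give a feel for the 1-bit SGD idea named above, here is a minimal sketch in plain Python. It is an illustrative toy, not CNTK's implementation: the function names `quantize_1bit` and `dequantize_1bit` are hypothetical, and the choices of a zero threshold and group-mean reconstruction values are assumptions made for the example. The core idea it shows is that each gradient value is sent as a single bit, and the quantization error is kept locally and added back to the next minibatch's gradient.

```python
def dequantize_1bit(bits, pos_value, neg_value):
    """Rebuild an approximate gradient from one bit per value."""
    return [pos_value if b else neg_value for b in bits]


def quantize_1bit(gradient, residual):
    """Quantize a gradient to one bit per value with error feedback.

    `residual` is the quantization error carried over from the previous
    minibatch. Returns (bits, pos_value, neg_value, new_residual).
    """
    # Add back the error that was dropped last time (error feedback).
    corrected = [g + r for g, r in zip(gradient, residual)]

    # One bit per value: 1 for non-negative, 0 for negative (assumed threshold).
    bits = [1 if v >= 0.0 else 0 for v in corrected]

    # Assumed reconstruction values: the mean of each group, 0.0 if empty.
    positives = [v for v in corrected if v >= 0.0]
    negatives = [v for v in corrected if v < 0.0]
    pos_value = sum(positives) / len(positives) if positives else 0.0
    neg_value = sum(negatives) / len(negatives) if negatives else 0.0

    # Whatever the bits cannot represent is kept for the next minibatch.
    reconstructed = dequantize_1bit(bits, pos_value, neg_value)
    new_residual = [v - q for v, q in zip(corrected, reconstructed)]
    return bits, pos_value, neg_value, new_residual


if __name__ == "__main__":
    residual = [0.0, 0.0, 0.0, 0.0]
    for grad in ([0.5, -0.25, 1.5, -0.75], [0.25, 0.25, -1.0, 0.5]):
        bits, pos, neg, residual = quantize_1bit(grad, residual)
        print(bits, pos, neg, residual)
```

In this sketch only `bits`, `pos_value`, and `neg_value` would need to be exchanged between workers, which is where the communication savings come from; the residual never leaves the worker.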
Motivation (15 minutes)
Introduction to Cognitive Toolkit (15 minutes)
Basics (20 minutes)
Images (60 minutes)
Text (40 minutes)
Reinforcement learning (20 minutes)
Conclusion and Q&A (10 minutes)
Anusua Trivedi is a data scientist on Microsoft’s advanced data science and strategic initiatives team, where she works on developing advanced predictive analytics and deep learning models. Previously, Anusua was a data scientist at the Texas Advanced Computing Center (TACC), a supercomputer center, where she developed algorithms and methods for the supercomputer to explore, analyze, and visualize clinical and biological big data. Anusua is a frequent speaker at machine learning and big data conferences across the United States, including Supercomputing 2015 (SC15), PyData Seattle 2015, and MLconf Atlanta 2015. Anusua has also held positions at UT Austin and the University of Utah.
Barbara Stortz is a principal software manager at Microsoft working on data science customer projects running on Microsoft Azure and Cortana Intelligence, including machine learning and deep learning technologies. Previously, Barbara was a senior vice president for SAP Labs LLC, a founding member of SAP HANA, and head of SAP’s EIM products and the SAP Healthcare platform.
Patrick Buehler is a principal data scientist in the Cloud AI Group at Microsoft. He has more than 15 years of experience in academic settings and with various external customers, spanning a wide range of computer vision problems. He earned his PhD in computer vision from Oxford with Andrew Zisserman.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.