Put AI to work
June 26-27, 2017: Training
June 27-29, 2017: Tutorials & Conference
New York, NY

Scalable deep learning with the Microsoft Cognitive Toolkit

Anusua Trivedi (Microsoft), Barbara Stortz (Microsoft), Patrick Buehler (Microsoft)
1:30pm-5:00pm Tuesday, June 27, 2017
Implementing AI
Location: Beekman
Level: Intermediate
Secondary topics:  Deep Learning, Machine Learning
Average rating: 2.67 (3 ratings)

Prerequisite Knowledge

  • Basic knowledge of machine learning

Materials or downloads needed in advance

  • A laptop
  • A GitHub account

What you'll learn

  • Explore the Microsoft Cognitive Toolkit, which is native on both Windows and Linux and offers a flexible symbolic graph, a friendly Python API, and almost linear scalability across multi-GPU systems and multiple machines


Since its breakthrough in the 2012 ImageNet Challenge, deep learning has become the de facto standard approach to most computer vision problems. In the past few years, with deeper and more complex neural network architectures, deep learning algorithms have met and exceeded human-level performance in image recognition. Increasingly, computer vision applications are adopting deep learning, and many of them are achieving great success. Nevertheless, training deep learning networks on a large dataset remains very challenging, and the sheer amount of computation required can take months. The community desperately needs tools that help train deep learning networks across multiple servers with multiple GPUs.

Anusua Trivedi, Barbara Stortz, and Patrick Buehler offer an overview of the Microsoft Cognitive Toolkit (CNTK), which is native on both Windows and Linux and offers a flexible symbolic graph, a friendly Python API, and almost linear scalability across multi-GPU systems and multiple machines. CNTK was originally designed for speech-processing tasks and was first released under a relatively restrictive license on Codeplex in April 2015. In February 2016, CNTK moved to GitHub under the much friendlier MIT License. In November 2016, CNTK 2.0 was released with both C++ and Python APIs. The Cognitive Toolkit was key to Microsoft Research’s recent breakthrough of reaching human parity in conversational speech recognition. It has been used extensively within Microsoft for image, text, and speech data, with each area benefiting from the built-in scalability.

A large number of deep learning toolkits are widely used in the vision community, including Caffe, Torch, Theano, TensorFlow, and MXNet. CNTK has unique advantages over these toolkits, especially in speed and scalability. Anusua, Barbara, and Patrick share a comparison of five well-known toolkits, demonstrate how CNTK achieves almost linear scalability via advanced algorithms such as 1-bit SGD and block-momentum SGD, and explain these algorithms in detail. Along the way, they’ll discuss the most recent results of the speed comparison between TensorFlow and CNTK. This experiment was independently conducted by NVIDIA on its latest DGX-1 system, with CNTK showing more than a two-fold improvement over TensorFlow when eight GPUs were used to train ResNet 50.
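
To give a rough sense of the idea behind 1-bit SGD mentioned above, here is a minimal plain-Python sketch. It is not CNTK's implementation; the function names (one_bit_quantize, dequantize) and the choice of reconstruction values (the mean of the non-negative entries and the mean of the negative entries) are assumptions made for illustration. Each gradient entry is sent as a single bit, and the quantization error is kept locally and added to the next gradient so that no information is permanently lost.

```python
def one_bit_quantize(gradient, residual):
    """Quantize (gradient + residual) to one bit per element.

    Returns (bits, pos_value, neg_value, new_residual), where new_residual
    is the quantization error to carry into the next minibatch.
    Illustrative sketch only; names and details are hypothetical.
    """
    # Error feedback: add the error left over from the previous step.
    corrected = [g + r for g, r in zip(gradient, residual)]

    # One bit per element: True for non-negative, False for negative.
    bits = [v >= 0.0 for v in corrected]

    # Two reconstruction values, one per bit state (assumed: group means).
    positives = [v for v in corrected if v >= 0.0]
    negatives = [v for v in corrected if v < 0.0]
    pos_value = sum(positives) / len(positives) if positives else 0.0
    neg_value = sum(negatives) / len(negatives) if negatives else 0.0

    # What the receiver will reconstruct, and the error we keep locally.
    decoded = [pos_value if b else neg_value for b in bits]
    new_residual = [c - d for c, d in zip(corrected, decoded)]
    return bits, pos_value, neg_value, new_residual


def dequantize(bits, pos_value, neg_value):
    """Rebuild an approximate gradient from the bits and two scalars."""
    return [pos_value if b else neg_value for b in bits]


if __name__ == "__main__":
    residual = [0.0, 0.0, 0.0, 0.0]
    for step_gradient in ([0.5, -0.25, 1.5, -0.75], [0.1, 0.2, -0.3, 0.4]):
        bits, pos, neg, residual = one_bit_quantize(step_gradient, residual)
        print(dequantize(bits, pos, neg), residual)
```

In a data-parallel setting, only the bits and the two scalars would need to be exchanged between workers, which is where the communication savings would come from under these assumptions.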


Motivation (15 minutes)

  • The growth of deep learning and AI

Introduction to Cognitive Toolkit (15 minutes)

  • Cognitive Toolkit architecture
  • Flexibility
  • Scalability
  • Windows and Linux are first-class citizens
  • Support for C++ and Python for both training and evaluation
  • RNN advantage due to dynamic axis

Basics (20 minutes)

  • Tensor operations
  • Data readers and built-in data augmentation
  • Image classification

Images (60 minutes)

  • Inception and ResNet
  • Fast and Faster R-CNN
  • Emotion recognition
  • Neural style

Text (40 minutes)

  • LSTM
  • char-rnn

Reinforcement learning (20 minutes)

  • Atari game

Conclusion and Q&A (10 minutes)

Anusua Trivedi


Anusua Trivedi is a data scientist on Microsoft’s advanced data science and strategic initiatives team, where she works on developing advanced predictive analytics and deep learning models. Previously, Anusua was a data scientist at the Texas Advanced Computing Center (TACC), a supercomputer center, where she developed algorithms and methods for the supercomputer to explore, analyze, and visualize clinical and biological big data. Anusua is a frequent speaker at machine learning and big data conferences across the United States, including Supercomputing 2015 (SC15), PyData Seattle 2015, and MLconf Atlanta 2015. Anusua has also held positions with UT Austin and the University of Utah.

Barbara Stortz


Barbara Stortz is a principal software manager at Microsoft working on data science customer projects running on Microsoft Azure and Cortana Intelligence, including machine learning and deep learning technologies. Previously, Barbara was a senior vice president for SAP Labs LLC, a founding member of SAP HANA, and head of SAP’s EIM products and the SAP Healthcare platform.

Patrick Buehler


Patrick Buehler is a principal data scientist in the Cloud AI Group at Microsoft. He has over 15 years of working experience in academic settings and with various external customers spanning a wide range of computer vision problems. He earned his PhD in computer vision from Oxford with Andrew Zisserman.