Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Distributed Training of Deep Learning Models

Mathew Salvaris (Microsoft), Miguel Gonzalez-Fierro (Microsoft), Ilia Karmanov (Microsoft)
11:15–11:55 Wednesday, 23 May 2018
Data science and machine learning
Location: Capital Suite 13
Level: Advanced

Who is this presentation for?

Data Scientist, AI Researcher

Prerequisite knowledge

Deep Learning, Docker and Python

What you'll learn

1) A comparison of the distributed training performance of each framework on available cloud hardware
2) Tips and pitfalls of distributed training with CNTK, TensorFlow (Horovod), PyTorch, MXNet, and Chainer
3) Templates for Batch AI and Deep Learning Workshop that you can adapt for your own projects

Description

In the last year there have been a number of attempts to train deep CNNs on the ImageNet dataset in the shortest time possible, with the most recent attempt managing to do it in 15 minutes. All of these attempts happen on custom clusters which are out of the reach of most data scientists.
One of the key advantages of the cloud is being able to scale out compute resources as required. In this talk we will present two platforms for running distributed deep learning in the cloud that are within the reach of every data scientist. The first is a service called Batch AI, which uses the Azure Batch infrastructure to run deep learning jobs at scale across GPUs. The second is an open source toolkit that allows data scientists to spin up clusters in turn-key fashion. It utilises Kubernetes and Grafana for easy job scheduling and monitoring, and has been used in daily production by Microsoft internal groups. Both platforms utilise Docker containers, making it possible to run any deep learning framework on them.
We will use these training platforms to train a ResNet network on the ImageNet dataset using each of the following frameworks: CNTK, TensorFlow (Horovod), PyTorch, MXNet, and Chainer. We will then compare the performance improvement as we scale the number of nodes, and provide tips and describe the pitfalls of each framework and platform. The examples presented can also be used as templates for your own deep learning problems.

Mathew Salvaris

Microsoft

Mathew Salvaris is a data scientist at Microsoft. Previously, Mathew was a data scientist for a small startup that provided analytics for fund managers; a postdoctoral researcher at UCL’s Institute of Cognitive Neuroscience, where he worked with Patrick Haggard in the area of volition and free will, devising models to decode human decisions in real time from the motor cortex using electroencephalography (EEG); and a postdoctoral researcher in the University of Essex’s Brain Computer Interface group, where he worked on BCIs for computer mouse control. Mathew holds a PhD in brain computer interfaces and an MSc in distributed artificial intelligence.

Miguel Gonzalez-Fierro

Microsoft

Miguel González-Fierro is a senior data scientist at Microsoft UK, where he helps customers improve their processes using big data and machine learning. Previously, he was CEO and founder of Samsamia Technologies, a company that created a visual search engine for fashion items, allowing users to find products using images instead of words, and founder of the Robotics Society of Universidad Carlos III, which developed projects related to UAVs, mobile robots, small humanoid competitions, and 3D printers. Miguel also worked as a robotics scientist at Universidad Carlos III of Madrid and King’s College London, where his research focused on learning from demonstration, reinforcement learning, computer vision, and dynamic control of humanoid robots. He holds a BSc and MSc in electrical engineering and an MSc and PhD in robotics.

Ilia Karmanov

Microsoft

Ilia is a data scientist working on applying ML and deep learning solutions in industry. He is particularly interested in the statistical theory behind deep learning. Ilia holds an MSc in economics from the London School of Economics.
