Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference

Training and scoring deep neural networks in the cloud

Wee Hyong Tok (Microsoft), Danielle Dean (iRobot)
11:15am–11:55am Thursday, December 7, 2017

Who is this presentation for?

  • Data scientists

Prerequisite knowledge

  • A basic understanding of deep learning and the data science process

What you'll learn

  • Understand how to approach deep learning projects and leverage cloud computing to efficiently train and score deep neural networks


Deep neural networks are responsible for many advances in natural language processing, computer vision, speech recognition, and even forecasting. However, these networks usually require vast amounts of data and are computationally expensive to train. Even with hardware that trains deep neural networks efficiently, training can still take a long time, especially since you often also have to tune the model and try different network architectures. As a result, it’s important to have an environment for building deep learning solutions in which you can explicitly weigh trade-offs such as training time versus cost. The environment should also let you conduct exploratory analyses, schedule programmatic training, and run both real-time and batch scoring.

Danielle Dean and Wee Hyong Tok illustrate how cloud computing has been leveraged to complete deep learning projects for the scenarios outlined above, using the Azure Data Science Virtual Machine with Deep Learning Toolkit, Azure Batch Shipyard, Spark on HDInsight, and even deep learning within SQL Server. Along the way, Danielle and Wee Hyong share practical tips, based on their real-world experience, for approaching deep learning projects in industry settings including healthcare, manufacturing, and utilities.

Although several of the technologies covered are specific to Microsoft’s Azure cloud computing platform, the different stages and approaches to a deep learning project are applicable across a range of technologies, many of them open source, such as Microsoft Cognitive Toolkit (previously known as CNTK) for deep learning.

Wee Hyong Tok


Wee Hyong Tok is a principal data science manager with the AI CTO Office at Microsoft, where he leads the engineering and data science team for the AI for Earth program. Wee Hyong has worn many hats in his career, including developer, program and product manager, data scientist, researcher, and strategist, and his track record of leading successful engineering and data science teams has given him unique superpowers to be a trusted AI advisor to customers. Wee Hyong coauthored several books on artificial intelligence, including Predictive Analytics Using Azure Machine Learning and Doing Data Science with SQL Server. Wee Hyong holds a PhD in computer science from the National University of Singapore.

Danielle Dean


Danielle Dean is the technical director of machine learning at iRobot. Previously, she was a principal data science lead at Microsoft. She holds a PhD in quantitative psychology from the University of North Carolina at Chapel Hill.