Presented By O’Reilly and Intel AI
Put AI to work
8-9 Oct 2018: Training
9-11 Oct 2018: Tutorials & Conference
London, UK

Building end-to-end computer vision solutions from pretrained deep learning models

Vanja Paunic (Microsoft), Patrick Buehler (Microsoft)
16:00–16:40 Thursday, 11 October 2018
Implementing AI, Models and Methods
Location: Windsor Suite
Secondary topics:  Computer Vision, Deep Learning models, Deep Learning tools

Who is this presentation for?

  • Data scientists, AI developers, software engineers, and machine learning developers

Prerequisite knowledge

  • A basic understanding of machine learning and deep neural networks

What you'll learn

  • Learn how to adapt pretrained DNN models for computer vision to entirely new use cases through a process called transfer learning
  • Explore Microsoft's DNN technologies for building computer vision solutions


In recent years, dramatic progress has been made in the field of computer vision using deep neural network (DNN) technology. DNN models can now be trained on tens of millions of images to reliably recognize thousands of different classes of images. Microsoft has been a leading force in the advancement of this technology, with its development of the ResNet modeling technique, which won the 2015 ImageNet Object Detection competition.

While these state-of-the-art research results are impressive, an even more valuable aspect of these DNN models is the ease with which they can be adapted to new use cases without extensive, computationally heavy retraining. Vanja Paunic and Patrick Buehler offer an overview of Microsoft’s DNN technologies for computer vision, describing both how the technology works and how Microsoft is making it available for outside users to build their own custom computer vision solutions. Vanja and Patrick focus specifically on the Microsoft Cognitive Toolkit, Custom Vision Service, and Azure Machine Learning Package for Computer Vision.

The Microsoft Cognitive Toolkit is a free, easy-to-use, commercial-grade open source toolkit for training deep learning algorithms. It lets users not only train image processing models from scratch but also adapt existing pretrained state-of-the-art models to new use cases with their own data. With this approach, high-quality models can be created from only a fraction of the data used to train large-scale ImageNet models.
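
The idea behind adapting a pretrained model can be illustrated with a minimal, library-free sketch. It is not Cognitive Toolkit code: `FROZEN_WEIGHTS` is a hypothetical stand-in for the body of a pretrained network, and `extract_features`, `train_head`, and `predict` are names invented for this illustration. The sketch assumes the simplest form of transfer learning, in which the pretrained layer is kept frozen and only a small new output layer is trained on a handful of examples from the new task.

```python
import math
import random

# Hypothetical stand-in for a pretrained network body. In practice these
# weights would come from a model trained on a large dataset such as ImageNet.
FROZEN_WEIGHTS = [[1.0, 0.5], [0.5, 1.0], [1.0, -1.0], [-0.5, 1.0]]


def extract_features(x):
    """Run the frozen layer; its weights are never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def train_head(features, labels, epochs=100, lr=0.5):
    """Fit a new logistic-regression head on the frozen features with SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(w * fi for w, fi in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            error = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * fi for w, fi in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    """Classify a raw input by passing it through the frozen layer and the new head."""
    f = extract_features(x)
    z = sum(w * fi for w, fi in zip(weights, f)) + bias
    return 1 if z > 0 else 0


# Small dataset for the new task: 20 noisy 2-D points in two classes.
rng = random.Random(0)
points, labels = [], []
for i in range(20):
    label = i % 2
    center = 1.0 if label == 1 else -1.0
    points.append([center + rng.uniform(-0.3, 0.3), center + rng.uniform(-0.3, 0.3)])
    labels.append(label)

# Only the head is trained; FROZEN_WEIGHTS stays untouched.
head_weights, head_bias = train_head([extract_features(p) for p in points], labels)
```

Calling `predict([1.0, 1.0], head_weights, head_bias)` then classifies a new point using the frozen features and the newly trained head. Because only the few head parameters are fitted, far less data is needed than for training the whole network from scratch.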

Azure Machine Learning Package for Computer Vision aims to simplify the end-to-end experience of building highly accurate custom DNN models for computer vision. It offers Python APIs for composing custom pipelines that cover dataset creation and augmentation; transfer learning and fine-tuning of pretrained DNN models for classification, object detection, and image similarity; evaluation of trained models against baselines; and deployment of trained models to Azure. Vanja and Patrick demonstrate how the AML Package for Computer Vision can be used to quickly build a custom computer vision solution for an image classification scenario and deploy it on Azure.
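
Two of the pipeline stages named above, dataset augmentation and evaluation against a baseline, can be sketched in plain Python. This does not use the AML Package's API: every function name here is hypothetical, images are represented as small nested lists of pixel values, and the model predictions are made-up placeholders chosen only to show the comparison.

```python
from collections import Counter


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment(dataset):
    """Return the original (image, label) pairs plus one flipped copy of each."""
    return dataset + [(horizontal_flip(img), label) for img, label in dataset]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(train_labels, n):
    """Baseline that always predicts the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


# Toy training set; the labels do not depend on left/right orientation,
# so flipping is a safe augmentation here.
dataset = [
    ([[9, 8], [7, 9]], "bright"),
    ([[1, 0], [0, 2]], "dark"),
    ([[8, 9], [9, 9]], "bright"),
]
augmented = augment(dataset)

# Placeholder predictions standing in for a trained model's output.
test_labels = ["bright", "dark", "dark", "bright"]
model_predictions = ["bright", "dark", "dark", "dark"]

model_accuracy = accuracy(model_predictions, test_labels)
baseline_accuracy = accuracy(
    majority_baseline([label for _, label in augmented], len(test_labels)),
    test_labels,
)
```

Comparing `model_accuracy` with `baseline_accuracy` gives a quick check that a trained model does better than always predicting the most common class.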

To simplify the process of building custom computer vision models even further, Microsoft has developed the Custom Vision Service, a simple web service and REST API that lets users upload image datasets, define classes and labels, and annotate data. The service then automatically builds a new deep learning model and deploys it as a web service that can annotate new images according to the user-defined annotation schema.

Vanja Paunic


Vanja Paunic is a data scientist in the Algorithms and Data Science Group at Microsoft London. She works on building machine learning solutions with external companies utilizing Microsoft’s AI Cloud Platform. She holds a PhD in computer science with a focus on data mining in the biomedical domain from the University of Minnesota.

Patrick Buehler


Patrick Buehler is a principal data scientist in the Cloud AI Group at Microsoft. He has over 15 years of experience in academic settings and with external customers, spanning a wide range of computer vision problems. He earned his PhD in computer vision from Oxford with Andrew Zisserman.