Using the cloud to scale up hyperparameter optimization for machine learning
Who is this presentation for?
- Data scientists
Hyperparameter optimization for machine learning (ML) models is a complex task that combines multiple training sessions with a search through a high-dimensional parameter space. Despite its complexity and its dependence on the ML algorithm, hyperparameter optimization can be treated as a meta task that is decoupled from the training sessions and implemented as a generic framework, one that samples the parameter search space while optimizing a user-defined metric. This approach makes hyperparameter optimization a generic form of process automation applicable to any machine learning algorithm, which frees you from time-consuming jobs like writing optimization code and managing repeatable data processing pipelines.
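The decoupling described above can be illustrated with a minimal sketch. It is not HyperDrive and not the speakers' code: it assumes plain random sampling, and the names `random_search`, `toy_train`, and `space` are hypothetical. The point is only that the search loop knows nothing about the model; it samples parameters, calls a user-supplied training function, and tracks a user-defined metric.

```python
import random
from typing import Callable, Dict, Tuple

# Hypothetical search space type: parameter name -> sampler drawing one value.
SearchSpace = Dict[str, Callable[[random.Random], float]]


def random_search(
    train_fn: Callable[[Dict[str, float]], float],
    space: SearchSpace,
    n_trials: int,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Sample the space n_trials times and keep the best-scoring parameters.

    train_fn is treated as a black box that returns the metric to maximize.
    """
    rng = random.Random(seed)
    best_params: Dict[str, float] = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        score = train_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_train(params: Dict[str, float]) -> float:
    """Stand-in for a real training session (assumption for illustration).

    The score peaks at learning_rate=0.1 and momentum=0.9.
    """
    return -((params["learning_rate"] - 0.1) ** 2) - (params["momentum"] - 0.9) ** 2


space: SearchSpace = {
    # Log-uniform between 1e-4 and 1.
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, 0),
    # Uniform between 0.5 and 0.99.
    "momentum": lambda rng: rng.uniform(0.5, 0.99),
}

best_params, best_score = random_search(toy_train, space, n_trials=50)
print(best_params, best_score)
```

Swapping `toy_train` for a function that launches a real training job leaves the search loop unchanged, which is the property that lets a service run the same loop for any algorithm.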
Fidan Boylu Uz, Mario Bourgoin, and George Iordanescu demonstrate how hyperparameter optimization can be performed in a transparent, scalable, and easy-to-manage way using Azure HyperDrive. You’ll explore object detection and text matching, two common ML scenarios in computer vision and natural language processing (NLP), both implemented using open source frameworks. For object detection, they used the Faster R-CNN algorithm, which is a state-of-the-art deep learning algorithm for that task. The algorithm has two implementations, one based on the TensorFlow Object Detection API and one on torchvision, two open source frameworks that are de facto standards for constructing and training object detection models. The text matching task includes text data featurization and is implemented via a scikit-learn pipeline.
All three implementations leverage Azure HyperDrive for hyperparameter optimization through the Azure Machine Learning (AzureML) Python SDK. The key points are decoupling AzureML dependencies via Conda environment files, constructing implementation-specific but platform-agnostic Docker files and corresponding Docker images for easy reproducibility, containerized data preprocessing using elastically allocated AzureML compute targets, and using AzureML HyperDrive for hyperparameter tuning while training the deep learning or NLP models. The end-to-end process is completed by showing how the tuned models are deployed at scale on Azure Kubernetes clusters and consumed by a Jupyter notebook widget client.
These steps provide an easily reproducible recipe for building custom object detection or NLP models. Via transfer learning, the two object detection implementations can be extended to a large class of object detection problems: publicly available pretrained models are further refined by training on smaller custom datasets to solve customer-specific problems that involve identifying multiple objects in images. For the text matching example, main steps like question selection, labeling, and data featurization are reusable for other similar NLP and generic ML problems. HyperDrive can be used in an autoscalable, generic, out-of-the-box fashion to automatically optimize the hyperparameter settings for heterogeneous and completely independent groups of applications by employing advanced features of the hyperparameter optimization framework, including early termination policies and random, grid, or Bayesian optimization.
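To make the idea of an early termination policy concrete, here is a small sketch of a slack-based rule in the spirit of a bandit-style policy. It is an illustration under stated assumptions, not HyperDrive's implementation: the function `runs_to_terminate` and the run ids are hypothetical, the metric is assumed to be positive with higher values better, and a run is stopped when its best metric so far falls below the leader's metric divided by (1 + slack factor).

```python
from typing import Dict, List


def runs_to_terminate(best_so_far: Dict[str, float], slack_factor: float) -> List[str]:
    """Return the ids of runs that trail the current leader by more than the slack.

    best_so_far maps a run id to the best metric that run has reported so far.
    Assumes a positive metric that is being maximized.
    """
    if not best_so_far:
        return []
    leader = max(best_so_far.values())
    threshold = leader / (1.0 + slack_factor)
    return sorted(run_id for run_id, metric in best_so_far.items() if metric < threshold)


# Hypothetical metrics reported at the same evaluation interval.
metrics = {"run_a": 0.80, "run_b": 0.70, "run_c": 0.60}

# With a slack factor of 0.2 the threshold is 0.80 / 1.2, roughly 0.667,
# so only run_c falls below it.
stopped = runs_to_terminate(metrics, slack_factor=0.2)
print(stopped)  # ['run_c']
```

Stopping weak runs early frees compute for new samples from the search space, which matters most when each training session is expensive, as with deep object detection models.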
Prerequisite knowledge
- General knowledge of machine learning, deep learning, and Python
What you'll learn
- Understand how to automate hyperparameter tuning for machine learning, with object detection using PyTorch and TensorFlow and text matching using scikit-learn as examples
Fidan Boylu Uz
Fidan Boylu Uz is a senior data scientist at Microsoft, where she’s responsible for the successful delivery of end-to-end advanced analytic solutions. She’s also worked on a number of projects on predictive maintenance and fraud detection. Fidan has 10+ years of technical experience in data mining and business intelligence. Previously, she was a professor conducting research and teaching courses on data mining and business intelligence at the University of Connecticut. She has a number of academic publications on machine learning and optimization and their business applications and holds a PhD in decision sciences.
Mario Bourgoin is a senior data scientist at Microsoft, where he supports the company’s efforts to democratize AI. He is a mathematician, data scientist, and statistician with a broad and deep knowledge of machine learning, artificial intelligence, data mining, statistics, and computational mathematics. Previously, he taught at several institutions and joined a Boston-area startup, where he worked on medical and business applications. He earned his PhD in mathematics from Brandeis University in Waltham, Massachusetts.
George Iordanescu is a data scientist on the algorithms and data science team for Microsoft’s Cortana Intelligence Suite. Previously, he was a research scientist in academia, a consultant in the healthcare and insurance industry, and a postdoctoral visiting fellow in computer-assisted detection at the National Institutes of Health (NIH). His research interests include semisupervised learning and anomaly detection. George holds a PhD in EE from Politehnica University in Bucharest, Romania.