Mar 15–18, 2020

Optimized image classification on the cheap

11:50am–12:30pm Wednesday, March 18, 2020
Location: 210 E

Who is this presentation for?

  • ML engineers and junior data scientists




Meghana Ravikumar anchors on an image classification use case to evaluate two approaches to transfer learning (fine-tuning and feature extraction) and the impact of hyperparameter optimization on these techniques. The goal is to draw on a rigorous set of experimental results that can help you answer the question: how can resource-constrained teams make trade-offs between efficiency and effectiveness using pre-trained models?

This experiment uses the Stanford Cars dataset to compare the effects of fine-tuning a shallow net (ResNet 18) versus using a deeper network (ResNet 50) as a feature extractor on image classification accuracy. To both maximize model performance on a budget and explore the impact of optimization on these methods, Meghana uses SigOpt’s implementation of multitask optimization for hyperparameter tuning and parallelizes these tuning jobs using SigOpt Orchestrate with Nvidia K80 GPUs. You’ll explore the extent to which these methods improve performance, at what cost, and with what impact on wall-clock time.

For both architectures, using multitask optimization to tune the hyperparameters gave a significant lift in model performance over the baseline. Fine-tuning ResNet 18 and optimizing the hyperparameters with multitask optimization produces the highest-performing model (87.33% accurate); that is 3.92% more accurate than the next-best version of the model (ResNet 18 without optimization). The optimized and fine-tuned ResNet 18 represents a 40.92% reduction in error compared to the nonoptimized ResNet 50.
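To make the "reduction in error" figure concrete, here is a small worked calculation. It assumes the 40.92% is a relative reduction in error rate (error = 100% minus accuracy); the baseline error it prints is derived from the two quoted numbers and is not a value reported in the abstract. The function names are illustrative.

```python
def error_rate(accuracy_pct):
    """Error rate in percent, given accuracy in percent."""
    return 100.0 - accuracy_pct


def implied_baseline_error(optimized_error_pct, relative_reduction_pct):
    """Solve optimized = baseline * (1 - reduction) for the baseline error."""
    return optimized_error_pct / (1.0 - relative_reduction_pct / 100.0)


optimized_accuracy = 87.33   # quoted: optimized, fine-tuned ResNet 18
relative_reduction = 40.92   # quoted: reduction in error vs. nonoptimized ResNet 50

optimized_error = error_rate(optimized_accuracy)
baseline_error = implied_baseline_error(optimized_error, relative_reduction)

print(f"optimized error: {optimized_error:.2f}%")
print(f"implied baseline error (derived, not reported): {baseline_error:.2f}%")
```

Under this reading, the nonoptimized ResNet 50 would have an error rate of a little over 21%, so the optimized model removes roughly two fifths of the baseline's mistakes.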

Now that there’s a model architecture and tuning technique in place, you’ll try to beat the model performance (87.33% accurate) by augmenting your data. Once more, Meghana leverages SigOpt multitask optimization and Nvidia K80 GPUs to efficiently tune hyperparameters and train the model. In this scenario, she adds image augmentation as a preprocessing step with the classification task downstream and multitask optimization informing parameters for both steps. Essentially, she creates a feedback loop where the hyperparameters for the image augmentation preprocessing are informed by the downstream classification task performance.
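The idea of scoring augmentation and training settings together against one downstream metric can be sketched as below. This is a simplified stand-in, not SigOpt's API or the speaker's setup: plain random search replaces multitask optimization (so past scores do not steer later suggestions, as a Bayesian optimizer's would), the parameter names and ranges are hypothetical, and `train_and_evaluate` returns a synthetic toy score so the sketch runs without data or GPUs.

```python
import random

# Hypothetical joint search space: augmentation and training
# hyperparameters are sampled together as one configuration.
SEARCH_SPACE = {
    "brightness": (0.0, 1.0),      # augmentation setting (assumed name)
    "contrast": (0.0, 1.0),        # augmentation setting (assumed name)
    "learning_rate": (1e-4, 1e-1), # classifier training setting
}


def sample_config(rng):
    """Draw one configuration uniformly from the joint space."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def train_and_evaluate(config):
    """Placeholder for: augment images, train the classifier, return accuracy.

    The formula is a synthetic toy objective, not an experimental result.
    """
    return 1.0 - (
        (config["brightness"] - 0.6) ** 2
        + (config["contrast"] - 0.4) ** 2
        + (config["learning_rate"] - 0.01) ** 2
    )


def tune(num_trials, seed=0):
    """Keep the configuration with the best downstream score."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = train_and_evaluate(config)  # one metric judges both steps
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = tune(num_trials=50)
print("best configuration found:", best_config)
```

The design point is that the augmentation settings are never evaluated on their own: they are kept or discarded only according to the accuracy of the classifier trained on the augmented images.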

In the context of the Stanford Cars dataset and model architecture at hand, you’ll find that the combination of image augmentation paired with multitask optimization gives a 6.65% boost in accuracy. Most interestingly, the black box augmentation transforms the original image to a hypervibrant image. The transformations accentuate boundaries between objects and make the color schemes less complicated than the ones found in the original image. This may allow the model to learn the edge patterns of cars and deal with background clutter, illumination, and intraclass variance better. This leads to the questions: Why are these transformations important? What features is the model learning from these images? What problems do these transformations solve? How would this model perform with more test data? Would these transformations hold for a deeper ResNet model or a different architecture?

Join this talk to learn the results from this experiment, discuss generalizability of any of these results, and learn techniques for evaluating the trade-offs between transfer learning and full network optimization techniques.

Prerequisite knowledge

  • A basic understanding of ML and computer vision

What you'll learn

  • Learn how to apply optimization throughout the modeling process, rather than only at the end, to make your models better

Meghana Ravikumar


Meghana Ravikumar is a machine learning engineer at SigOpt with a particular focus on novel applications of deep learning across academia and industry. In particular, Meghana explores the impact of hyperparameter optimization and other techniques on model performance and evangelizes these practical lessons for the broader machine learning community. Previously, she was in biotech, employing natural language processing to mine and classify biomedical literature. She holds a BS degree in bioengineering from UC Berkeley. When she’s not reading papers, developing models and tools, or trying to explain complicated topics, she enjoys doing yoga, traveling, and hunting for the perfect chai latte.
