Sep 23–26, 2019

Learning with Limited Labeled Data

Shioulin Sam (Cloudera Fast Forward Labs)
1:15pm–1:55pm Wednesday, September 25, 2019
Location: 1A 12/14
Secondary topics: Deep Learning

Who is this presentation for?

Data Scientists, Machine Learning Engineers, Product Managers



Prerequisite knowledge

Basic math, basic understanding of classifiers and neural networks

What you'll learn

* Classical active learning strategies (engineered heuristics) to choose the "best" data to label
* Active learning algorithms tailored for deep learning
* An under-the-hood understanding of active learning
* When to use active learning, and what to look out for


Being able to teach machines with examples is a powerful capability, but it hinges on the availability of vast amounts of data. The data not only needs to exist, but must also be in a form that allows relationships between input features and output to be uncovered. Labeling each input example fulfills this requirement, but it is an expensive undertaking.

Classical approaches to this problem rely on human and machine collaboration. In these approaches, engineered heuristics are used to smartly select “best” instances of data to label, in order to reduce cost. A human steps in to provide the label; the model then learns from this smaller labeled dataset. Recent advancements have made these approaches amenable to deep learning, enabling models to be built with limited labeled data.
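
As a rough illustration of one such heuristic (not necessarily the one covered in the talk), the sketch below ranks unlabeled items by the entropy of a model's predicted class probabilities and picks the most uncertain ones to send to a human labeler. The function names and the `pool_probs` values are hypothetical and chosen only for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, k):
    """Return indices of the k pool items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical binary-classifier outputs for four unlabeled items.
pool_probs = [
    [0.98, 0.02],  # confident prediction, little to gain from a label
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # moderately confident
    [0.50, 0.50],  # maximally uncertain
]

# Indices of the two items a human would be asked to label next.
queried = select_for_labeling(pool_probs, k=2)
print(queried)  # [3, 1]
```

In a full loop, the newly labeled items would be added to the training set, the model retrained, and the selection repeated until the labeling budget runs out.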

In this talk, we explore algorithmic approaches that drive this capability, and provide practical guidance for translating this capability into production. We provide intuition for how and why these algorithms work through a live demo.

Shioulin Sam

Cloudera Fast Forward Labs

Shioulin Sam is a research engineer at Cloudera Fast Forward Labs, where she bridges academic research in machine learning with industrial applications. Previously, she managed a portfolio of early-stage ventures focusing on women-led startups and public market investments. She also worked in the investment management industry designing quantitative trading strategies. She holds a Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology.
