Being able to teach machines with examples is a powerful capability, but it hinges on the availability of vast amounts of data. The data not only needs to exist but also has to be in a form that allows relationships between input features and output to be uncovered. Creating labels for each input feature fulfills this requirement but is an expensive undertaking.
Classical approaches to this problem rely on human and machine collaboration. In these approaches, engineered heuristics are used to smartly select “best” instances of data to label, in order to reduce cost. A human steps in to provide the label; the model then learns from this smaller labeled dataset. Recent advancements have made these approaches amenable to deep learning, enabling models to be built with limited labeled data.
Shioulin Sam shares algorithmic approaches that drive this capability and provides practical guidance for translating this capability into production. Join in to discover how and why these algorithms work through a live demo.
Shioulin Sam is a research engineer at Cloudera Fast Forward Labs, where she bridges academic research in machine learning with industrial applications. Previously, she managed a portfolio of early stage ventures focusing on women-led startups and public market investments and worked in the investment management industry designing quantitative trading strategies. She holds a PhD in electrical engineering and computer science from the Massachusetts Institute of Technology.
For exhibition and sponsorship opportunities, email strataconf@oreilly.com
For information on trade opportunities with O'Reilly conferences, email partners@oreilly.com
View a complete list of Strata Data Conference contacts
©2019, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com