Put AI to Work

April 15-18, 2019
New York, NY

How to train your model (and catch label leakage)

Till Bergmann (Salesforce), Leah McGuire (Salesforce)

1:50pm–2:30pm Thursday, April 18, 2019

Case Studies, Machine Learning
Location: Sutton South

Who is this presentation for?

Data scientists, machine learning engineers, and product owners of ML products

Level

Intermediate

Prerequisite knowledge

A basic understanding of data (including messy data) and machine learning methods

What you'll learn

Explore techniques and methods to successfully identify and deal with label leakage and messy data

Description

Data leakage, or label leakage, is a pervasive but often overlooked problem in predictive modeling on real-life data. It takes on monstrous proportions at enterprise companies such as Salesforce that provide ML as a service to other businesses: the data is populated by diverse and often unknown business processes, which makes it very hard for data scientists to distinguish cause from effect.

Till Bergmann and Leah McGuire explain how Salesforce, which needs to churn out thousands of customer-specific models for any given use case, tackled this problem. The automated approaches they describe are part of TransmogrifAI, Salesforce's recently open-sourced Spark-based library, and extend the boundaries of what typically falls in the domain of automated machine learning.
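
To make the idea of label leakage concrete, here is a minimal sketch of one common screening heuristic: flag any feature whose correlation with the label is suspiciously close to perfect. It is an illustration only and not the method presented in the talk or the TransmogrifAI API. The function names, the feature names, the toy data, and the 0.95 threshold are all assumptions made for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(var_x * var_y)


def flag_leaky_features(features, label, threshold=0.95):
    """Return the names of features whose absolute correlation with the
    label exceeds the threshold (hypothetical helper, assumed threshold)."""
    return sorted(
        name
        for name, values in features.items()
        if abs(pearson(values, label)) > threshold
    )


# Toy data: 200 binary outcomes and two invented features.
rng = random.Random(0)
label = [rng.randint(0, 1) for _ in range(200)]
features = {
    # Noisy, weakly predictive signal that is known before the outcome.
    "num_emails_sent": [rng.gauss(5 + y, 2) for y in label],
    # Field that a business process fills in only after the outcome
    # is known, so it mirrors the label exactly.
    "closed_date_filled": [float(y) for y in label],
}

leaky = flag_leaky_features(features, label)
print(leaky)
```

In this toy setup only the field populated after the outcome is flagged, while the weaker but legitimate signal passes. A single correlation threshold is a crude screen; it would miss leakage spread across several fields or hidden in categorical and text data, so it should be read as a starting point rather than a complete check.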

Till Bergmann

Salesforce

Till Bergmann is a senior data scientist at Salesforce Einstein, building platforms to make it easier to integrate machine learning into Salesforce products, with a focus on automating many of the laborious steps in the machine learning pipeline. He holds a PhD in cognitive science from the University of California, Merced, where he studied the collaboration patterns of academics using NLP techniques.

Leah McGuire

Salesforce

Leah McGuire is a principal member of the technical staff at Salesforce Einstein, where she builds platforms to enable the integration of machine learning into Salesforce products. Previously, Leah was a senior data scientist on the data products team at LinkedIn, where she worked on personalization, entity resolution, and relevance for a variety of LinkedIn data products. She also completed a postdoctoral fellowship at the University of California, Berkeley. She holds a PhD in computational neuroscience from the University of California, San Francisco, where she studied the neural encoding and integration of sensory signals.
