14–17 Oct 2019

The dangers of data leakage in production machine learning systems

Martin Goodson (Evolution AI)
13:45–14:25 Wednesday, 16 October 2019
Location: Westminster Suite
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data scientists, engineers, and product managers

Level

Intermediate

Description

According to published research, data leakage is frequently found in public datasets, and it is likely to be at least as widespread in the private sector, where there’s less transparency.

Data leakage occurs when a model gains access to data that it shouldn’t have access to. AI systems can fail catastrophically in production if leakage is not dealt with properly. Martin Goodson details the four main manifestations of data leakage and explains how to recognize the warning signs. By mastering several key scientific principles, you can mitigate the risk of failure.
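
As a rough illustration of the definition above, the sketch below reproduces one widely known form of leakage: choosing features using all rows, including the rows later used for testing. It is not taken from the talk, and the abstract does not say which four manifestations are covered. The data is pure random noise, so an honest evaluation should score near chance (0.5). All names here (accuracy_with_selection, run_trial, the sample and feature counts) are hypothetical choices made for this example.

```python
import random


def accuracy_with_selection(X, y, train_idx, test_idx, select_idx, k):
    """Pick the k features most associated with y over the select_idx rows,
    fit sign weights on the train_idx rows, and score on the test_idx rows."""
    n_features = len(X[0])

    def score(j, rows):
        # For +/-1 data this sum is proportional to the correlation with y.
        return sum(X[i][j] * y[i] for i in rows)

    chosen = sorted(range(n_features),
                    key=lambda j: abs(score(j, select_idx)),
                    reverse=True)[:k]
    # Weights are always fitted on training rows only; the leak, when
    # present, comes solely from the feature selection step above.
    weights = {j: 1 if score(j, train_idx) >= 0 else -1 for j in chosen}

    correct = 0
    for i in test_idx:
        vote = sum(weights[j] * X[i][j] for j in chosen)
        prediction = 1 if vote > 0 else -1  # k is odd, so vote is never 0
        correct += prediction == y[i]
    return correct / len(test_idx)


def run_trial(seed, n_samples=100, n_features=1000, k=21):
    """Generate pure-noise data and evaluate it with and without the leak."""
    rng = random.Random(seed)
    y = [rng.choice((-1, 1)) for _ in range(n_samples)]
    X = [[rng.choice((-1, 1)) for _ in range(n_features)]
         for _ in range(n_samples)]
    train_idx = list(range(n_samples // 2))
    test_idx = list(range(n_samples // 2, n_samples))
    all_idx = train_idx + test_idx

    # Leaky: feature selection sees the test rows as well.
    leaky = accuracy_with_selection(X, y, train_idx, test_idx, all_idx, k)
    # Honest: feature selection sees the training rows only.
    honest = accuracy_with_selection(X, y, train_idx, test_idx, train_idx, k)
    return leaky, honest


# Average over a few seeds to smooth out noise from the small test sets.
results = [run_trial(seed) for seed in range(5)]
leaky_acc = sum(r[0] for r in results) / len(results)
honest_acc = sum(r[1] for r in results) / len(results)
print(f"selection on all rows (leaky): {leaky_acc:.2f}")
print(f"selection on training rows only: {honest_acc:.2f}")
```

Because the features carry no real signal, any accuracy well above 0.5 in the first line of output comes from the selection step having seen the test rows. A model evaluated this way would look strong offline and then fall back to chance in production.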

Prerequisite knowledge

  • Familiarity with supervised learning, classification, precision, recall, accuracy, cross-validation, and train/test splits

What you'll learn

  • Learn the errors that data leakage causes and how to build systems that protect against common manifestations of data leakage

Martin Goodson

Evolution AI

Martin Goodson is the chief scientist and CEO of Evolution AI, where he specializes in large-scale natural language processing. Martin has designed data science products that are in use at companies like Dun & Bradstreet, Time Inc., John Lewis, and Condé Nast. Previously, Martin was a statistician at the University of Oxford, where he conducted research on statistical matching problems for DNA sequences. He runs the largest community of machine learning practitioners in Europe, Machine Learning London, and convenes the CBI/Royal Statistical Society roundtable, AI in Financial Services. Martin’s work has been covered by publications such as the Economist, Quartz, Business Insider, TechCrunch, and others.
