Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Talking to the machines: Monitoring production machine learning systems

Ting-Fang Yen (DataVisor)

5:10pm–5:50pm Wednesday, March 27, 2019

Data Science, Machine Learning & AI
Location: 2011

Secondary topics: Automation in data science and big data, Model lifecycle management, Temporal data and time-series analytics

Average rating:

(4.00, 3 ratings)

Who is this presentation for?

Data scientists and engineers who build and maintain production machine learning models

Level

Intermediate

Prerequisite knowledge

A general understanding of machine learning technologies
Experience working with large-scale datasets and systems

What you'll learn

Understand the practical challenges of deploying machine learning systems
Learn how to maintain a production machine learning system that handles over a billion requests per day on average
Explore metrics for model quality when labeled data isn't available

Description

Production machine learning systems require constant monitoring, not just to keep the system online but also to ensure the model inference results are correct. This is much more straightforward when user feedback or labels are available. In those cases, the model performance can be tracked and periodically reevaluated using standard metrics such as precision, recall, or AUC. But what about when labeled data is lacking? In many applications, labels are expensive to obtain (requiring human analysts’ manual review) or cannot be obtained in a timely manner (e.g., not available until weeks or months later).
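When labels are available, the standard metrics mentioned above reduce to simple counts over the model's decisions. The following is a minimal sketch in Python, assuming binary decisions and binary labels; the function name and calling convention are illustrative and do not come from the talk.

```python
def precision_recall(predictions, labels):
    """Compute precision and recall for binary decisions.

    `predictions` and `labels` are equal-length sequences of 0/1 values
    (hypothetical input format chosen for this sketch).
    """
    pairs = list(zip(predictions, labels))
    true_pos = sum(1 for p, y in pairs if p and y)
    false_pos = sum(1 for p, y in pairs if p and not y)
    false_neg = sum(1 for p, y in pairs if not p and y)
    # Guard against empty denominators (no positive decisions or no positive labels).
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall
```

Tracking these values per time window is only possible once labels arrive, which is the limitation the rest of the abstract addresses.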

Ting-Fang Yen discusses the design and implementation of a real-time system to monitor production machine learning systems. The approach is designed to discover detection anomalies, such as volume spikes caused by spurious false positives, as well as gradual concept drifts when the model is no longer able to capture the target concept. In either case, it is able to automatically detect undesirable model behaviors early.
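The two failure modes named above (sudden volume spikes and gradual drift) can be illustrated without labels by looking only at the model's output over time. The sketch below is an assumption-laden illustration, not the system described in the talk: it flags spikes with a trailing-window z-score and summarizes drift as a shift in the mean between a reference period and the most recent period. The function names, window sizes, and thresholds are all hypothetical.

```python
from statistics import mean, pstdev

def find_spikes(counts, window=5, threshold=3.0, min_scale=1.0):
    """Return indices where a count exceeds the trailing-window mean by
    more than `threshold` standard deviations (upward spikes only)."""
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        center = mean(baseline)
        # Floor the scale so a perfectly flat baseline does not divide by zero.
        scale = max(pstdev(baseline), min_scale)
        if (counts[i] - center) / scale > threshold:
            spikes.append(i)
    return spikes

def mean_shift(rates, reference_size, recent_size):
    """Difference between the mean of the most recent values and the mean
    of an initial reference period; a crude proxy for gradual drift."""
    reference = rates[:reference_size]
    recent = rates[-recent_size:]
    return mean(recent) - mean(reference)

# Hypothetical per-interval detection counts with one abrupt spike.
detections = [100, 102, 98, 101, 99, 100, 400, 101]
spike_indices = find_spikes(detections)
```

A z-score on a trailing window reacts to abrupt changes but adapts to slow ones, which is why a separate comparison against a fixed reference period is used here for drift.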

Part of the approach borrows from signal processing techniques for time series decomposition, where the time series can be used to represent a sequence of model decisions on different types of input data, or the amount of deviation between consecutive model runs. The approach calculates cross-correlation among the identified anomalies to facilitate root cause analysis of the model behavior.
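The abstract does not say which decomposition or correlation method is used, so the following is only a sketch of one common choice: classical additive decomposition (centered moving-average trend, per-position seasonal means, residual) followed by a lagged Pearson correlation between two anomaly indicator series. All function names and the equal-length, at-least-two-periods input requirements are assumptions made for this example.

```python
from statistics import mean

def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.
    Assumes len(series) covers at least two full periods."""
    n, half = len(series), period // 2
    trend = [None] * n  # undefined at the edges of the series
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the two end points each get half weight.
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    # Average the detrended values at each position within the period.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(series[i] - t)
    seasonal = [mean(b) for b in buckets]
    offset = mean(seasonal)
    seasonal = [s - offset for s in seasonal]  # center on zero
    residual = [None if t is None else series[i] - t - seasonal[i % period]
                for i, t in enumerate(trend)]
    return trend, seasonal, residual

def cross_correlation(a, b, lag):
    """Pearson correlation between a[t] and b[t + lag] for equal-length series."""
    if lag >= 0:
        x, y = a[:len(a) - lag], b[lag:]
    else:
        x, y = a[-lag:], b[:len(b) + lag]
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (vx * vy) ** 0.5
```

In this sketch, large residuals mark candidate anomalies in each series, and a high correlation at some lag between the anomaly indicators of two series suggests they may share a cause, which is the kind of hint that root cause analysis could start from.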

This work is a step toward automated deployment of machine learning in production as well as new tools for interpreting model inference results.

Ting-Fang Yen

DataVisor

Ting-Fang Yen is the director of research at DataVisor, the leading fraud, crime, and abuse-detection solution using unsupervised machine learning to detect fraudulent and malicious activity such as fake account registrations, fraudulent transactions, spam, account takeovers, and more. She has over 10 years of experience in applying big data analytics and machine learning to tackle problems in cybersecurity. Ting-Fang holds a PhD in electrical and computer engineering from Carnegie Mellon University.

Comments

Olga Sattler | HEADHUNTER

03/25/2019 9:55am PDT

Ting-Fang,

I saw that you are a speaker at the Strata Conference; I’m looking forward to your presentation. I have been following your company for the last couple of months and I am very impressed with your success. We have also spoken a few times on LinkedIn.

I would love to introduce myself and discuss a strategic partnership with your company. I have helped many companies in the Big Data and Analytics space, such as Amazon, Bosch, Samsung, Graphcore, LG Electronics, MapR, Platfora, Couchbase, Snowflake, MongoDB, and E8 Security, with their challenging hiring needs.

I am looking forward to meeting you soon. Good luck with your speech.

Thanks,
