Talking to the machines: Monitoring production machine learning systems
Who is this presentation for?
- Data scientists and engineers who build or maintain production machine learning models
Level
Intermediate
Description
Production machine learning systems require constant monitoring not just to keep the system online but also to ensure the model inference results are correct. This is much more straightforward when user feedback or labels are available. In those cases, the model performance can be tracked and periodically reevaluated using standard metrics such as precision, recall, or area under the curve (AUC). But labeled data is often lacking. In many applications, labels are expensive to obtain (requiring human analysts’ manual review) or cannot be obtained in a timely manner (e.g., not available until weeks or months later).
Ting-Fang Yen describes the design and implementation of a real-time system for monitoring production machine learning systems. The system discovers detection anomalies, such as volume spikes caused by spurious false positives, as well as gradual concept drift, where the model is no longer able to capture the target concept. In either case, undesirable model behavior can be detected automatically and early. Part of the approach borrows signal processing techniques for time series decomposition; here, a time series can represent a sequence of model decisions on different types of input data or the amount of deviation between consecutive model runs. Calculating the cross-correlation among the identified anomalies facilitates root cause analysis of the model behavior. This work is a step toward automated deployment of machine learning in production as well as new tools for interpreting model inference results.
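The abstract does not spell out the implementation, so the following is only a rough sketch of the general idea, not the system described in the talk. It assumes an additive decomposition (moving-average trend plus a per-position median seasonal profile), a median-absolute-deviation rule for flagging residual spikes, and Pearson correlation at several lags as the cross-correlation measure. The function names, the 3.5 threshold, and the synthetic hourly data are all hypothetical choices made for illustration.

```python
import numpy as np


def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Trend: moving average over one full period. At the two ends, where a
    # full window is not available, reuse the nearest computed value.
    kernel = np.ones(period) / period
    core = np.convolve(x, kernel, mode="valid")
    left = period // 2
    trend = np.pad(core, (left, period - 1 - left), mode="edge")
    # Seasonal: median of the detrended values at each position in the cycle.
    # The median keeps a single spike from distorting the seasonal profile.
    detrended = x - trend
    profile = np.array([np.median(detrended[i::period]) for i in range(period)])
    seasonal = np.tile(profile, n // period + 1)[:n]
    return trend, seasonal, x - trend - seasonal


def flag_anomalies(residual, threshold=3.5):
    """Return indices whose robust z-score (median/MAD based) exceeds threshold."""
    r = np.asarray(residual, dtype=float)
    center = np.median(r)
    mad = np.median(np.abs(r - center))
    if mad == 0:
        return np.array([], dtype=int)
    score = 0.6745 * (r - center) / mad
    return np.flatnonzero(np.abs(score) > threshold)


def cross_correlation(a, b, max_lag):
    """Pearson correlation of a[t] with b[t + lag] for each lag in [-max_lag, max_lag]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[: len(a) - lag], b[lag:]
        else:
            x, y = a[-lag:], b[: len(b) + lag]
        if x.std() == 0 or y.std() == 0:
            result[lag] = 0.0
        else:
            result[lag] = float(np.corrcoef(x, y)[0, 1])
    return result


if __name__ == "__main__":
    # Synthetic example: two hourly detection-volume series with a daily cycle.
    # A spike in series A at hour 200 is followed by a spike in B at hour 203.
    rng = np.random.default_rng(0)
    t = np.arange(24 * 14)
    daily = 100 + 20 * np.sin(2 * np.pi * t / 24)
    volume_a = daily + rng.normal(0, 1, t.size)
    volume_b = daily + rng.normal(0, 1, t.size)
    volume_a[200] += 60
    volume_b[203] += 60

    _, _, resid_a = decompose(volume_a, period=24)
    _, _, resid_b = decompose(volume_b, period=24)
    print("anomalies in A:", flag_anomalies(resid_a))
    print("anomalies in B:", flag_anomalies(resid_b))
    corr = cross_correlation(resid_a, resid_b, max_lag=6)
    print("lag with strongest correlation:", max(corr, key=corr.get))
```

In this sketch, a strong correlation at a positive lag suggests that anomalies in the first series tend to precede those in the second, which is one way such a signal could narrow down where to start root cause analysis. The same functions could be applied to a series of deviations between consecutive model runs instead of detection volumes.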
Prerequisite knowledge
- General knowledge of machine learning technologies
- Experience working with large-scale datasets and systems
What you'll learn
- Gain an understanding of the practical challenges of deploying machine learning systems
- Learn from the experience of maintaining a production machine learning system that handles over a billion requests per day on average
- See metrics for model quality when labeled data is not available

Ting-Fang Yen
DataVisor
Ting-Fang Yen is the director of research at DataVisor, the leading fraud, crime, and abuse-detection solution using unsupervised machine learning to detect fraudulent and malicious activity such as fake account registrations, fraudulent transactions, spam, account takeovers, and more. She has over 10 years of experience in applying big data analytics and machine learning to tackle problems in cybersecurity. Ting-Fang holds a PhD in electrical and computer engineering from Carnegie Mellon University.