Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Online evaluation of machine learning models

Ted Dunning (MapR, now part of HPE)
2:40pm-3:20pm Wednesday, March 27, 2019
Secondary topics: Model lifecycle management
Average rating: 4.70 (10 ratings)

Who is this presentation for?

  • Data scientists, data engineers, and DataOps team members



Prerequisite knowledge

  • Basic knowledge of what machine learning systems do

What you'll learn

  • Understand why monitoring machine learning-based systems is different from monitoring conventional systems
  • Learn practical methods for monitoring real-world systems


Academic machine learning almost exclusively involves offline evaluation of machine learning models. In the real world this is, somewhat surprisingly, only good enough for a rough cut that eliminates the real dogs. For production work, online evaluation is often the only option to determine which of several final-round candidates might be chosen for further use.

As Einstein is rumored to have said, theory and practice are the same, in theory. In practice, they are different. So it is with models. Part of the problem is interaction with other models and systems. Part of the problem has to do with the variability of the real world. Often, there are adversaries at work. It may even be sunspots. One particular problem arises when models choose their own training data and thus couple back onto themselves.

In addition to these difficulties, production models almost always have service-level agreements that have to do with how quickly they must produce results and how often they are allowed to fail. These operational considerations can be as important as the accuracy of the model: the right results returned late are worse than slightly wrong results returned in time.
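To make the operational side concrete, here is a minimal sketch of checking a batch of requests against a service-level agreement. It assumes a simple SLA in which a request is "bad" if it either failed or succeeded after a deadline. The function name `sla_report`, the `(latency_ms, succeeded)` record shape, and the thresholds are illustrative assumptions, not anything taken from the talk.

```python
def sla_report(requests, deadline_ms, max_bad_fraction):
    """Summarize a batch of requests against a hypothetical SLA.

    requests: iterable of (latency_ms, succeeded) pairs.
    A request counts against the SLA if it failed outright or if it
    succeeded but took longer than deadline_ms.
    """
    requests = list(requests)
    if not requests:
        raise ValueError("need at least one request")

    failed = sum(1 for _, ok in requests if not ok)
    late = sum(1 for ms, ok in requests if ok and ms > deadline_ms)
    bad_fraction = (failed + late) / len(requests)

    return {
        "failed": failed,
        "late": late,
        "bad_fraction": bad_fraction,
        "meets_sla": bad_fraction <= max_bad_fraction,
    }


if __name__ == "__main__":
    # Four made-up requests: one late, one failed.
    sample = [(10, True), (250, True), (30, False), (40, True)]
    print(sla_report(sample, deadline_ms=200, max_bad_fraction=0.25))
```

Counting late answers alongside failures reflects the point above that a correct result returned too late can be as bad as no result.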

Ted Dunning offers a survey of useful ways to evaluate models in the real world, breaking the problem of evaluation apart into operational and functional evaluation and demonstrating how to do each without unnecessary pain and suffering. You’ll learn about decoy and canary models, nonlinear latency histogramming, model-delta diagrams, and more. These techniques may sound arcane, but each is simple at heart and doesn’t require any advanced mathematics to understand. Along the way, he shares exciting visualization techniques that will help make differences strikingly apparent.
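The abstract does not say how the latency histograms are binned, so the following is only one plausible reading of "nonlinear latency histogramming": logarithmically spaced buckets, which keep fine resolution for fast responses while still covering a long tail. The function names, the 1 ms floor, and the five-buckets-per-decade setting are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def log_bucket(latency_ms, min_ms=1.0, buckets_per_decade=5):
    """Map a latency to a log-spaced bucket index.

    Bucket 0 holds everything at or below min_ms. Each later bucket
    covers a constant ratio of latencies, so bucket widths grow with
    the latency itself.
    """
    if latency_ms <= min_ms:
        return 0
    decades = math.log10(latency_ms / min_ms)
    return 1 + int(math.floor(decades * buckets_per_decade))


def latency_histogram(latencies_ms, min_ms=1.0, buckets_per_decade=5):
    """Count latencies per log-spaced bucket."""
    return Counter(
        log_bucket(ms, min_ms, buckets_per_decade) for ms in latencies_ms
    )


def histogram_delta(baseline, candidate):
    """Per-bucket count difference (candidate minus baseline)."""
    buckets = sorted(set(baseline) | set(candidate))
    return {b: candidate.get(b, 0) - baseline.get(b, 0) for b in buckets}


if __name__ == "__main__":
    # Made-up latencies for two models serving the same traffic.
    baseline = latency_histogram([0.5, 3, 3.5, 300])
    candidate = latency_histogram([0.5, 3, 50, 50])
    print(histogram_delta(baseline, candidate))
```

Comparing bucket counts for two models on the same traffic shows where in the latency range they differ, which a single average would hide.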

Ted Dunning

MapR, now part of HPE

Ted Dunning is the chief technology officer at MapR, an HPE company. He’s also a board member for the Apache Software Foundation, a PMC member, and committer on a number of projects. Ted has years of experience with machine learning and other big data solutions across a range of sectors. He’s contributed to clustering, classification, and matrix decomposition algorithms in Mahout and to the new Mahout Math library and designed the t-digest algorithm used in several open source projects and by a variety of companies. Previously, Ted was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics (LifeLock). Ted has coauthored a number of books on big data topics, including several published by O’Reilly related to machine learning, and has 24 issued patents to date plus a dozen pending. He holds a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.