Online evaluation of machine learning models
Who is this presentation for?
- Data scientists, data engineers, developers, and technical leads
Academic machine learning relies almost exclusively on offline evaluation of models. In the real world, somewhat surprisingly, offline evaluation is often only good enough for a rough cut that eliminates the real dogs. For production work, online evaluation is often the only way to determine which of several final-round candidates should be chosen for further use. As Einstein is rumored to have said, theory and practice are the same in theory; in practice, they’re different. So it is with models. Part of the problem is interaction with other models and systems, and part is the variability of the real world. Often there are adversaries at work. It may even be sunspots. One particular problem arises when models choose their own training data and thus couple back onto themselves.
In addition to these difficulties, production models almost always have service-level agreements that specify how quickly they must produce results and how often they are allowed to fail. These operational considerations can be as important as the accuracy of the model: right results returned late are worse than slightly wrong results returned in time.
Ted Dunning surveys useful ways to evaluate models in real-world use, including decoy and canary models, nonlinear latency histogramming, model-delta diagrams, and more. These techniques may sound arcane, but each has a simple heart and should not require any advanced mathematics to understand.
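The abstract does not spell out what nonlinear latency histogramming involves. One plausible reading, offered here only as a minimal sketch, is a histogram whose bucket widths grow geometrically, so that both millisecond-scale and second-scale response times stay visible in the same picture. The function names, the `buckets_per_decade` parameter, and the 1 ms floor below are illustrative assumptions, not details from the talk.

```python
import math
from collections import Counter


def log_bucket(latency_ms, buckets_per_decade=5, min_ms=1.0):
    """Return the index of the log-spaced bucket that holds latency_ms.

    Hypothetical helper: values below min_ms are clamped into bucket 0.
    """
    clamped = max(latency_ms, min_ms)
    return math.floor(buckets_per_decade * math.log10(clamped / min_ms))


def bucket_bounds(index, buckets_per_decade=5, min_ms=1.0):
    """Return the (low, high) latency range in ms covered by a bucket index."""
    low = min_ms * 10 ** (index / buckets_per_decade)
    high = min_ms * 10 ** ((index + 1) / buckets_per_decade)
    return low, high


def latency_histogram(latencies_ms, buckets_per_decade=5, min_ms=1.0):
    """Count observations per log-spaced bucket."""
    return Counter(
        log_bucket(x, buckets_per_decade, min_ms) for x in latencies_ms
    )


if __name__ == "__main__":
    # Made-up sample latencies in milliseconds, for illustration only.
    samples = [2, 3, 15, 1500, 1550]
    for index, count in sorted(latency_histogram(samples).items()):
        low, high = bucket_bounds(index)
        print(f"{low:8.1f} to {high:8.1f} ms: {count}")
```

With equal-width buckets, a handful of slow requests would either be lumped into one overflow bin or force bins so wide that the fast requests become indistinguishable. Log-spaced buckets keep roughly constant relative resolution across the whole range, which matters when a service-level agreement is stated in terms of tail latency.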
Prerequisite knowledge
- General knowledge of machine learning
- Experience with a production machine learning system (useful but not required)
What you'll learn
- Discover why evaluating machine learning systems is hard and how to make it easier
MapR, now part of HPE
Ted Dunning is the chief technology officer at MapR, an HPE company. He’s also a board member for the Apache Software Foundation, a PMC member, and committer on a number of projects. Ted has years of experience with machine learning and other big data solutions across a range of sectors. He’s contributed to clustering, classification, and matrix decomposition algorithms in Mahout and to the new Mahout Math library and designed the t-digest algorithm used in several open source projects and by a variety of companies. Previously, Ted was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics (LifeLock). Ted has coauthored a number of books on big data topics, including several published by O’Reilly related to machine learning, and has 24 issued patents to date plus a dozen pending. He holds a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.