Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Better machine learning logistics with the rendezvous architecture

Ted Dunning (MapR)
5:10pm–5:50pm Wednesday, March 7, 2018

Who is this presentation for?

  • Data architects, developers, and DataOps and ops team members

Prerequisite knowledge

  • A basic understanding of models

What you'll learn

  • Explore the rendezvous architecture
  • Learn a number of techniques for clean hand-off of system components
  • Understand what decoy and canary servers are and what they are used for

Description

Most of the effort in machine learning goes into everything except the learning. Handling these overhead tasks well makes a large difference in results, if only because it increases the amount of time you can spend thinking about the real problems.

Ted Dunning offers an overview of the rendezvous architecture, developed to be the “continuous integration” system for machine learning models. He describes its motivation and design and gives a user’s-eye view of how it feels to roll new services into production. The rendezvous architecture allows always-hot, zero-latency rollout and rollback of new models and supports extensive metrics and diagnostics so models can be compared as they process production data. It can even hot-swap the framework itself with no downtime. Best of all, the rendezvous architecture is simple and understandable.
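
The abstract does not spell out the mechanism, so the following is only a rough sketch of one way the idea could work, not the speaker's implementation. It assumes that every deployed model scores every request (which keeps all models hot) and that a rendezvous step returns the answer from the most-preferred model that met a deadline. The model names, the simulated latencies, and the functions `evaluate_all` and `rendezvous` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Result:
    model: str         # which model produced this answer
    score: float       # the model's output for the request
    latency_ms: float  # how long the model took (simulated here)


# Hypothetical stand-ins for deployed models: each returns (score, latency_ms).
MODELS: Dict[str, Callable[[float], Tuple[float, float]]] = {
    "v1": lambda x: (x * 2.0, 5.0),
    "v2": lambda x: (x * 2.1, 8.0),
    "slow": lambda x: (x * 2.2, 50.0),
}


def evaluate_all(x: float) -> List[Result]:
    """Every model scores every request, so all of them stay hot."""
    return [Result(name, *model(x)) for name, model in MODELS.items()]


def rendezvous(results: List[Result], preference: List[str],
               deadline_ms: float) -> Optional[Result]:
    """Return the answer from the most-preferred model that met the deadline."""
    on_time = {r.model: r for r in results if r.latency_ms <= deadline_ms}
    for name in preference:
        if name in on_time:
            return on_time[name]
    return None  # no preferred model answered in time


results = evaluate_all(10.0)
# Rollout and rollback are just a change in the preference order.
before = rendezvous(results, ["v1", "v2"], deadline_ms=20.0)
after = rendezvous(results, ["v2", "v1"], deadline_ms=20.0)
# A preferred model that misses the deadline falls back to the next choice.
fallback = rendezvous(results, ["slow", "v1"], deadline_ms=20.0)
print(before.model, after.model, fallback.model)
```

Because `results` holds every model's output for the same request, the same list could also feed side-by-side metrics, which is one reading of how models "can be compared as they process production data."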

Ted Dunning

MapR

Ted Dunning is the chief technology officer at MapR. He’s also a board member for the Apache Software Foundation; a PMC member and committer of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects; and a mentor for various incubator projects. Ted has years of experience with machine learning and other big data solutions across a range of sectors. He’s contributed to clustering, classification, and matrix decomposition algorithms in Mahout and to the new Mahout Math library and designed the t-digest algorithm used in several open source projects and by a variety of companies. Previously, Ted was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics (LifeLock). Ted has coauthored a number of books on big data topics, including several published by O’Reilly related to machine learning, and has 24 issued patents to date plus a dozen pending. He holds a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.