Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Accelerating development velocity of production ML systems with Docker

Kinnary Jangla (Pinterest)
11:15–11:55 Thursday, 24 May 2018
Secondary topics: Data Platforms, Managing and Deploying Machine Learning, Media, Advertising, Entertainment
Average rating: 3.00 (5 ratings)

Who is this presentation for?

  • Machine learning engineers, data scientists, managers working with ML, and site reliability engineers

Prerequisite knowledge

  • A basic understanding of microservice-based architecture

What you'll learn

  • Learn how Pinterest dockerized the microservices powering its home feed to accelerate development and decrease operational complexity
  • Explore tips and tricks to help you do the same

Description

The rise of microservices has allowed ML systems to grow in complexity but has also introduced new challenges when things inevitably go wrong. Kinnary Jangla explains how Pinterest dockerized the microservices powering its home feed to accelerate development and decrease operational complexity and outlines benefits gained from this change that may be applicable to other microservice-based ML systems. You’ll learn tips and tricks for dockerizing your large-scale legacy production services and see how an architectural change like this can reshape the way your ML team works.

Most companies provide isolated development environments for engineers to work within. While a necessity once a team reaches even a small size, this same organizational choice introduces potentially frustrating dependencies when those individual environments inevitably drift. Pinterest’s project was initially motivated by challenges arising from the difficulty of testing individual changes in a reproducible way. Without having standardized environments, predeployment testing often yielded nonrepresentative results, causing downtime and confusion for those responsible for keeping the service up.
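As a rough illustration of the drift problem described above, the following Python sketch compares one engineer's installed package versions against a shared set of pinned versions. It is not taken from the talk: the function name `find_drift`, the package names, and the version numbers are all hypothetical.

```python
def find_drift(team_pins, local_versions):
    """Return {package: (expected, actual)} for each package whose
    local version differs from the team-wide pin.

    A package missing from the local environment is reported with
    an actual version of None.
    """
    drift = {}
    for package, expected in team_pins.items():
        actual = local_versions.get(package)  # None if not installed
        if actual != expected:
            drift[package] = (expected, actual)
    return drift


# Hypothetical team-wide pins and one engineer's local environment.
team_pins = {"thrift": "0.11.0", "numpy": "1.14.3"}
local_versions = {"thrift": "0.9.3", "numpy": "1.14.3"}

print(find_drift(team_pins, local_versions))
# prints {'thrift': ('0.11.0', '0.9.3')}
```

A check like this only reports drift after it has happened; packaging the dependencies into a shared image, as the talk describes, aims to prevent it in the first place.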

The Docker solution that was eventually deployed prepackages all dependencies found in each microservice, allowing developers to quickly set up large portions of the home feed stack and always test on the current team-wide configs. This architecture enables the team to debug latency issues, expand its testing suite to include connecting to simulated databases, and develop its Thrift APIs more quickly.
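To make the idea of testing against a simulated database more concrete, here is a minimal Python sketch under stated assumptions. It does not reflect Pinterest's actual services: `InMemoryPinStore`, `rank_home_feed`, and the sample data are hypothetical names chosen only for illustration, and a real setup might instead run a database container alongside the service under test.

```python
class InMemoryPinStore:
    """Hypothetical stand-in for a real database client.

    Maps a user ID to a dict of {pin_id: relevance_score}.
    """

    def __init__(self, rows):
        self._rows = dict(rows)

    def get_scores(self, user_id):
        # Unknown users get an empty result, like an empty query.
        return self._rows.get(user_id, {})


def rank_home_feed(store, user_id, limit=2):
    """Return up to `limit` pin IDs for a user, highest score first.

    The store is passed in, so a test can supply a simulated one.
    """
    scores = store.get_scores(user_id)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:limit]


# A test wires the service logic to the simulated store.
store = InMemoryPinStore(
    {"user_1": {"pin_a": 0.2, "pin_b": 0.9, "pin_c": 0.5}}
)
assert rank_home_feed(store, "user_1") == ["pin_b", "pin_c"]
assert rank_home_feed(store, "unknown_user") == []
```

Because the store is injected rather than created inside `rank_home_feed`, the same ranking code can run against a simulated store in tests and a real client in production.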

Kinnary Jangla

Pinterest

Kinnary Jangla is a senior software engineer on the home feed team at Pinterest, where she works on the machine learning infrastructure team as a backend engineer. Kinnary has worked in the industry for 10+ years. Previously, she worked on maps and international growth at Uber and on Bing search at Microsoft. She is the author of two books. Kinnary holds an MS in computer science from the University of Illinois and a BE from the University of Mumbai.