Presented by O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Accelerating development velocity of production ML systems with Docker

Kinnary Jangla (Pinterest)
11:00am–11:40am Wednesday, March 7, 2018
Average rating: 2.25 (8 ratings)

Who is this presentation for?

  • Machine learning engineers, data scientists, managers working with ML, and site reliability engineers

Prerequisite knowledge

  • A basic understanding of microservice-based architecture

What you'll learn

  • Explore how Pinterest dockerized the services powering its home feed to accelerate development and decrease operational complexity

Description

The rise of microservices has allowed ML systems to grow in complexity but has also introduced new challenges when things inevitably go wrong. Most companies give engineers isolated development environments to work in. While this is a necessity once a team reaches even a modest size, the same organizational choice introduces potentially frustrating dependency mismatches as those individual environments inevitably drift apart.

Kinnary Jangla explains how Pinterest dockerized the services powering its home feed to accelerate development and decrease operational complexity, and she outlines the benefits of the change that may carry over to other microservice-based ML systems. The project was initially motivated by the difficulty of testing individual changes in a reproducible way: without standardized environments, predeployment testing often yielded nonrepresentative results, causing downtime and confusion for those responsible for keeping the service up.

The Docker solution that was eventually deployed prepackages each microservice with all of its dependencies, allowing developers to quickly bring up large portions of the home feed stack and always test against the current team-wide configs. This architecture has enabled the team to debug latency issues, expand its test suite to connect to simulated databases, and iterate more quickly on its Thrift APIs.
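To make the idea concrete, it can be sketched as a Docker Compose file that pins each service's dependencies into an image and wires in a simulated database for testing. The service names, images, and ports below are hypothetical placeholders for illustration, not Pinterest's actual configuration:

  # docker-compose.yml (illustrative sketch only)
  version: "3"
  services:
    feed-ranker:            # hypothetical ML microservice; its image bakes in the runtime, libraries, and team-wide configs
      build: ./feed-ranker
      ports:
        - "9090:9090"       # e.g., a Thrift endpoint exposed for local testing
      depends_on:
        - mock-store
    mock-store:             # simulated database the test suite connects to
      image: mysql:5.7
      environment:
        MYSQL_ALLOW_EMPTY_PASSWORD: "yes"

With a file along these lines, a single docker-compose up gives every engineer the same pinned dependencies and configs, which is the reproducibility property the talk emphasizes.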

Kinnary shares tips and tricks for dockerizing a large-scale legacy production service and discusses how an architectural change like this can change how an ML team works.


Kinnary Jangla

Pinterest

Kinnary Jangla is a senior software engineer on the home feed team at Pinterest, where she works on machine learning infrastructure as a backend engineer. Kinnary has worked in the industry for 10+ years. Previously, she worked on maps and international growth at Uber and on Bing search at Microsoft. Kinnary holds an MS in computer science from the University of Illinois and a BE from the University of Mumbai.


Comments

Kinnary Jangla | SENIOR SOFTWARE ENGINEER
03/12/2018 7:19am PDT

The slides have been uploaded.

Santosh Rao | SENIOR TECHNICAL DIRECTOR
03/10/2018 2:15am PST

Could you please upload slides. Thanks.

Kinnary Jangla | SENIOR SOFTWARE ENGINEER
03/07/2018 1:36am PST

Yes I will after the talk.

Byambasuren Ganbaatar | DATA ENGINEER
03/07/2018 12:40am PST

Do you share your slides?