Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Zipline: Airbnb's data management platform for machine learning

Varant Zanoyan (Airbnb)
2:55pm–3:35pm Wednesday, 09/12/2018
Data engineering and architecture
Location: 1A 21/22
Level: Intermediate
Secondary topics:  Data Platforms, Retail and e-commerce
Average rating: 4.33 (6 ratings)

Who is this presentation for?

  • Data scientists and engineers

Prerequisite knowledge

  • Familiarity with the problems of building ML models and launching them to production (e.g., the difficulty of creating training data at scale)

What you'll learn

  • Explore Zipline, Airbnb’s data management platform specifically designed for ML use cases
  • Understand how to solve problems regarding training data generation with point-in-time correctness, feature consistency for online scoring, collaborating on training data, and data management

Description

Zipline is Airbnb’s data management platform specifically designed for ML use cases. Previously, ML practitioners at Airbnb spent roughly 60% of their time collecting data and writing transformations for machine learning tasks. Zipline reduces this work from months to about a day. It allows users to define features in an easy-to-use configuration language, then provides the following capabilities:

  • Resource-efficient, point-in-time-correct training set backfills and scheduled updates
  • Feature visualizations and automatic data quality monitoring
  • Feature availability in the online scoring environment (batch and streaming)
  • Batch correction (lambda architecture)
  • Collaboration and sharing of features
  • Data ownership and management

Varant Zanoyan covers Zipline’s architecture and dives into how it solves ML-specific problems. Although these problems are widespread, there are no open source solutions for them. As a result, Airbnb intends to open-source Zipline in the near future.
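
To make the idea of a point-in-time correct training set more concrete, here is a minimal sketch in plain Python. It is not Zipline's API or implementation: the function name `point_in_time_join`, the tuple layouts, and the sample data are all hypothetical, chosen only for illustration. The sketch assumes that "point-in-time correct" means each training row may only use feature values observed strictly before that row's label timestamp, so that no future information leaks into training.

```python
from bisect import bisect_left
from collections import defaultdict


def point_in_time_join(labels, feature_events):
    """Attach to each label the latest feature value seen strictly before it.

    Hypothetical layouts (assumptions, not Zipline's schema):
      labels:         iterable of (entity_id, label_ts, label)
      feature_events: iterable of (entity_id, event_ts, value)
    Returns a list of (entity_id, label_ts, feature_value, label).
    """
    # Group feature events by entity and sort each group by timestamp.
    by_entity = defaultdict(list)
    for entity_id, event_ts, value in feature_events:
        by_entity[entity_id].append((event_ts, value))
    for events in by_entity.values():
        events.sort(key=lambda event: event[0])

    rows = []
    for entity_id, label_ts, label in labels:
        events = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in events]
        # Index of the first event at or after label_ts; everything
        # before that index happened strictly earlier than the label.
        idx = bisect_left(timestamps, label_ts)
        value = events[idx - 1][1] if idx > 0 else None
        rows.append((entity_id, label_ts, value, label))
    return rows


# Illustrative data only: a feature value changing over time per listing.
feature_events = [
    ("listing_1", 1, 10),
    ("listing_1", 5, 12),
    ("listing_1", 9, 20),
    ("listing_2", 4, 7),
]
labels = [
    ("listing_1", 6, 1),   # sees the value from ts=5, not the later ts=9
    ("listing_2", 10, 1),  # sees the value from ts=4
    ("listing_3", 3, 0),   # no feature history, so the value is None
]
training_rows = point_in_time_join(labels, feature_events)
```

A naive join that always takes the latest feature value would give the first label the value recorded at timestamp 9, which did not exist yet when that label was generated. Avoiding this kind of leakage at scale is the sort of problem the backfill capability described above addresses.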

Varant Zanoyan

Airbnb

Varant Zanoyan is a software engineer on the ML Infrastructure team at Airbnb, where he works on tools and frameworks for building and productionizing ML models. Previously, he solved data infrastructure problems at Palantir Technologies.

Comments

Byambasuren Ganbaatar | DATA ENGINEER
09/17/2018 11:04pm EDT

Where did you upload it? I don’t see any download button here.

Varant Zanoyan | SOFTWARE ENGINEER
09/17/2018 9:06am EDT

Slides are uploaded

Byambasuren Ganbaatar | DATA ENGINEER
09/16/2018 5:28am EDT

Please share slides