Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Bighead: Airbnb's end-to-end machine learning platform

Atul Kale (Airbnb), Xiaohan Zeng (Airbnb)
2:05pm–2:45pm Wednesday, 09/12/2018
Data science and machine learning
Location: 1A 08
Level: Beginner
Secondary topics: Data Platforms, Model lifecycle management, Retail and e-commerce
Average rating: 5.00 (3 ratings)

Who is this presentation for?

  • Engineers, product managers, and data scientists interested in machine learning and decision makers exploring ML tools

Prerequisite knowledge

  • A basic understanding of machine learning

What you'll learn

  • Explore Airbnb’s machine learning platform Bighead
  • Discover how the platform may serve your own needs
  • Understand some of the challenges that the company has faced and how it has overcome them

Description

Airbnb’s data-driven products present a wide variety of unique ML problems, ranging from traditional models built on structured data to state-of-the-art models that leverage unstructured data, such as user reviews, messages, and images. The ability to build, iterate on, and maintain healthy machine learning models is critical to Airbnb’s success.

An end-to-end solution typically needs to cover data collection, feature engineering, training, deployment, serving, and monitoring. Few platforms currently cover all of these stages in a user-friendly way. Moreover, the heterogeneous nature of ML problems and the need for scalability make fast iteration and productionization difficult.

Atul Kale and Xiaohan Zeng offer an overview of Bighead, Airbnb’s user-friendly and scalable end-to-end machine learning framework that powers Airbnb’s data-driven products. Bighead is built on Python, Spark, and Kubernetes. Its components include a lifecycle management service, an offline training and inference engine, an online inference service, a prototyping environment, and a Docker image customization tool. Each component can be used individually. In addition, Bighead includes a unified model building API that smoothly integrates popular libraries including TensorFlow, XGBoost, and PyTorch. Standardizing data collection and transformation, model training environments, and production deployment makes each model reproducible and easy to iterate on.
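
To make the idea of a unified model building API more concrete, here is a minimal sketch in plain Python. It is not Bighead's actual API, which this abstract does not show. The names Model, Pipeline, MeanModel, NearestNeighborModel, and scale_by_ten are all hypothetical, and the two toy models merely stand in for wrappers around libraries such as TensorFlow, XGBoost, or PyTorch. The sketch assumes only that every wrapped model exposes the same fit and predict methods, so that the same transformations run at training and inference time regardless of the backend.

```python
from typing import Callable, List, Protocol, Sequence

Row = Sequence[float]


class Model(Protocol):
    """Hypothetical common interface that every library wrapper exposes."""

    def fit(self, X: List[Row], y: List[float]) -> None: ...

    def predict(self, X: List[Row]) -> List[float]: ...


class MeanModel:
    """Toy backend: always predicts the mean of the training labels."""

    def __init__(self) -> None:
        self.mean = 0.0

    def fit(self, X: List[Row], y: List[float]) -> None:
        self.mean = sum(y) / len(y)

    def predict(self, X: List[Row]) -> List[float]:
        return [self.mean for _ in X]


class NearestNeighborModel:
    """Toy backend: predicts the label of the closest training row."""

    def fit(self, X: List[Row], y: List[float]) -> None:
        self.X, self.y = list(X), list(y)

    def predict(self, X: List[Row]) -> List[float]:
        def dist(a: Row, b: Row) -> float:
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            self.y[min(range(len(self.X)), key=lambda i: dist(self.X[i], row))]
            for row in X
        ]


class Pipeline:
    """Runs identical transforms for training and inference, then delegates."""

    def __init__(self, transforms: List[Callable[[Row], Row]], model: Model) -> None:
        self.transforms = transforms
        self.model = model

    def _apply(self, X: List[Row]) -> List[Row]:
        out = []
        for row in X:
            for transform in self.transforms:
                row = transform(row)
            out.append(list(row))
        return out

    def fit(self, X: List[Row], y: List[float]) -> "Pipeline":
        self.model.fit(self._apply(X), y)
        return self

    def predict(self, X: List[Row]) -> List[float]:
        return self.model.predict(self._apply(X))


def scale_by_ten(row: Row) -> Row:
    """Example feature transformation shared by every backend."""
    return [value / 10.0 for value in row]


X_train = [[10.0], [20.0], [30.0]]
y_train = [1.0, 2.0, 3.0]

# Swapping the backend does not change the surrounding pipeline code.
for backend in (MeanModel(), NearestNeighborModel()):
    pipeline = Pipeline([scale_by_ten], backend).fit(X_train, y_train)
    print(type(backend).__name__, pipeline.predict([[29.0]]))
```

Because the transformation step lives in the pipeline rather than in each model, the code that prepares features for training is the same code that runs at serving time, which is one plausible reading of how standardization supports reproducibility.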

Atul and Xiaohan explore Bighead’s architecture, detail the problems that each individual component and the overall system aim to solve, and outline a vision for the future of machine learning infrastructure. Bighead is widely adopted at Airbnb, with a variety of models in production, and has enabled the company to reduce model development time from months to days. Airbnb plans to open source Bighead to allow the broader community to benefit from this work.

Atul Kale

Airbnb

Atul Kale is a software engineer on Airbnb’s machine learning infrastructure team. Previously, Atul worked in finance building machine learning-driven proprietary trading strategies and the data pipelines to support them. He holds a degree in computer engineering from the University of Illinois Urbana-Champaign.

Xiaohan Zeng

Airbnb

Xiaohan Zeng is a software engineer on the machine learning infrastructure team at Airbnb. Previously, he worked on the machine learning platform team at Groupon. He holds degrees in chemical engineering from Tsinghua University and Northwestern University, but he began pursuing a career in software engineering and machine learning after doing research in data science. Outside work, he enjoys reading, writing, traveling, movies, and trying to follow his daughter around when she suddenly decides to practice walking.

Comments

Krishna Chaitanya | SENIOR ENGINEERING MANAGER
09/12/2018 10:19am EDT

Can you share the slides?