Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

A/B testing at Uber: How we built a BYOM (bring your own metrics) platform

Milene Darnis (Uber)
1:10pm–1:50pm Thursday, 09/13/2018
Data engineering and architecture
Location: 1A 21/22 Level: Intermediate
Secondary topics: Data Platforms, Transportation and Logistics
Average rating: 4.22 (9 ratings)

Who is this presentation for?

  • Data engineers, analysts, and data scientists

Prerequisite knowledge

  • A basic understanding of SQL and data engineering design principles
  • Familiarity with A/B testing methodologies (useful but not required)

What you'll learn

  • How Uber decoupled experimentation logs from business metrics to scale its experimentation reporting platform


Most companies tightly couple their experimentation logs (who is seeing which experiment) with the business metrics needed to assess the impact of experiments (like completed trips, retention rate, and cost per trip). By running pipelines that precompute results for all experiments on a regular cadence (typically once a day), they fulfill basic experimentation needs. This approach works well for companies that run only a few experiments and always look at the same set of metrics across all of them.

But what happens when new metrics need to be onboarded? Or when so many experiments run at the same time that the pipelines become prone to breaking?

Given the pace at which Uber operates, the metrics needed to assess the impact of experiments constantly evolve. Milene Darnis explains how the team built a scalable and self-serve platform that lets users plug in any metric to analyze. Milene covers architecture choices for the experimentation platform that enable users to self-onboard their experimentation metrics.

It all starts from the logs. Uber’s experimentation team relies on Kafka and Spark to consistently track who is seeing which experiment and when. Milene then explains why the team decoupled logs from the metrics needed to analyze experiments, moved away from precomputing the same set of metrics for all experiments, and instead built a framework that lets people write their own SQL in a templated way. You’ll learn the power of summary tables and how the team used Hive to build “smart” aggregate tables that can be easily joined to any self-onboarded metric, effectively letting users pick and choose the metrics to analyze for each experiment. Finally, you’ll see how the team leveraged Presto and an async architecture to render 99% of experimentation reports in less than two minutes. Milene concludes with some thoughts on what’s next for the experimentation reporting tool at Uber.
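
To make the decoupling idea concrete, here is a minimal sketch of joining an exposure summary table to a user-supplied metric query through a fixed SQL template. It is an illustration only, not Uber's implementation: SQLite stands in for Hive and Presto, and every table, column, and function name (exposure_summary, trips, REPORT_TEMPLATE, run_report) is a hypothetical chosen for this example.

```python
import sqlite3
from string import Template

# Hypothetical report template: the platform owns the join against the
# exposure summary table; the user supplies only a metric query that
# returns (user_id, metric_value).
REPORT_TEMPLATE = Template("""
SELECT e.treatment_group,
       COUNT(*) AS users,
       AVG(COALESCE(m.metric_value, 0)) AS avg_metric
FROM exposure_summary AS e
LEFT JOIN ($metric_sql) AS m ON m.user_id = e.user_id
WHERE e.experiment_id = :experiment_id
GROUP BY e.treatment_group
ORDER BY e.treatment_group
""")

# A self-onboarded metric (assumed example): completed trips per user.
COMPLETED_TRIPS_SQL = """
SELECT user_id, COUNT(*) AS metric_value
FROM trips
WHERE status = 'completed'
GROUP BY user_id
"""


def build_demo_db():
    """Create an in-memory database with toy exposure and trip data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE exposure_summary (
            experiment_id TEXT,
            user_id INTEGER,
            treatment_group TEXT,
            first_exposure_date TEXT
        );
        CREATE TABLE trips (user_id INTEGER, status TEXT);
        INSERT INTO exposure_summary VALUES
            ('exp_1', 1, 'control',   '2018-09-01'),
            ('exp_1', 2, 'control',   '2018-09-01'),
            ('exp_1', 3, 'treatment', '2018-09-01'),
            ('exp_1', 4, 'treatment', '2018-09-02');
        INSERT INTO trips VALUES
            (1, 'completed'),
            (2, 'canceled'),
            (3, 'completed'),
            (3, 'completed'),
            (4, 'completed');
    """)
    return conn


def run_report(conn, experiment_id, metric_sql):
    """Plug a metric query into the template and aggregate it per group."""
    query = REPORT_TEMPLATE.substitute(metric_sql=metric_sql)
    return conn.execute(query, {"experiment_id": experiment_id}).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in run_report(conn, "exp_1", COMPLETED_TRIPS_SQL):
        print(row)
```

Because the exposure table knows nothing about trips, a new metric needs only a new query of the same shape, with no change to the logging side. Substituting raw SQL into a template is unsafe for untrusted input, so a real platform would presumably validate or sandbox user queries.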


Milene Darnis


Milene Darnis is a data product manager at Uber, focusing on building a world-class experimentation platform. Previously, she was a data engineer at Uber, where she modeled core datasets, and a business intelligence engineer at a mobile gaming company. Milene is passionate about linking data to concrete business problems. She holds a master’s degree in engineering from Telecom ParisTech, France.