Most companies tightly couple their experimentation logs (who is seeing which experiment) with the business metrics needed to assess the impact of experiments (like completed trips, retention rate, and cost per trip). By running pipelines that precompute results for all experiments on a regular cadence (typically once a day), they fulfill basic experimentation needs. This approach works well for companies that run only a few experiments and always look at the same set of metrics across all of them.
But what happens when new metrics need to be onboarded? Or when so many experiments are running at the same time that the pipelines become prone to breaking?
Given the pace at which Uber operates, the metrics needed to assess the impact of experiments constantly evolve. Milene Darnis explains how the team built a scalable and self-serve platform that lets users plug in any metric to analyze. Milene covers architecture choices for the experimentation platform that enable users to self-onboard their experimentation metrics.
It all starts from the logs. Uber’s experimentation team relies on Kafka and Spark to consistently track who is seeing which experiment and when. Milene then explains why the team decoupled these logs from the metrics needed to analyze experiments, and why it moved away from precomputing the same set of metrics for every experiment in favor of a framework that lets people write their own SQL in a templated way. You’ll learn the power of summary tables and how the team used Hive to build “smart” aggregate tables that can be easily joined to any self-onboarded metric, giving users the ability to pick and choose which metrics to analyze for each experiment. Finally, you’ll see how the team leveraged Presto and an async architecture to render 99% of experimentation reports in less than two minutes. Milene concludes with some thoughts on what’s next for the experimentation reporting tool at Uber.
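To make the idea of templated SQL joined to an exposure summary table concrete, here is a minimal sketch. It is not Uber's implementation. The table names (`experiment_exposure_summary`, `trips`), the column names, the template, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for Hive and Presto. The user supplies only a metric query that returns `user_id` and `metric_value`; the template supplies the join to the summary table.

```python
import sqlite3
from string import Template

# Hypothetical report template: the platform owns the join to the exposure
# summary table; the user only plugs in a metric query that returns
# (user_id, metric_value).
REPORT_TEMPLATE = Template("""
SELECT e.treatment_group,
       COUNT(*) AS users,
       AVG(COALESCE(m.metric_value, 0)) AS mean_metric
FROM experiment_exposure_summary AS e
LEFT JOIN ($metric_sql) AS m ON m.user_id = e.user_id
WHERE e.experiment_key = :experiment_key
GROUP BY e.treatment_group
ORDER BY e.treatment_group
""")

# Example of a self-onboarded metric (hypothetical schema).
COMPLETED_TRIPS_SQL = """
SELECT user_id, COUNT(*) AS metric_value
FROM trips
WHERE status = 'completed'
GROUP BY user_id
"""


def render_report_sql(metric_sql: str) -> str:
    """Plug a user-written metric query into the shared report template."""
    return REPORT_TEMPLATE.substitute(metric_sql=metric_sql.strip().rstrip(";"))


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory dataset: one row per exposed user, plus trips."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE experiment_exposure_summary (
            experiment_key TEXT, user_id INTEGER,
            treatment_group TEXT, first_exposure_date TEXT);
        CREATE TABLE trips (user_id INTEGER, status TEXT);
        INSERT INTO experiment_exposure_summary VALUES
            ('new_pricing', 1, 'control',   '2018-01-01'),
            ('new_pricing', 2, 'control',   '2018-01-01'),
            ('new_pricing', 3, 'treatment', '2018-01-01'),
            ('new_pricing', 4, 'treatment', '2018-01-02');
        INSERT INTO trips VALUES
            (1, 'completed'), (2, 'canceled'),
            (3, 'completed'), (3, 'completed'), (4, 'completed');
    """)
    return conn


def run_report(conn: sqlite3.Connection, experiment_key: str, metric_sql: str):
    """Render the template for one metric and run it for one experiment."""
    sql = render_report_sql(metric_sql)
    return conn.execute(sql, {"experiment_key": experiment_key}).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in run_report(conn, "new_pricing", COMPLETED_TRIPS_SQL):
        print(row)
```

Because the summary table holds one row per exposed user, adding a metric means writing one more query of the same shape rather than changing a shared pipeline. The sketch splices the metric SQL directly into the template, so a real platform would need to validate or sandbox user-written queries.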
Milene Darnis is a data product manager at Uber, focusing on building a world-class experimentation platform. Previously, she was a data engineer at Uber, where she modeled core datasets, and a business intelligence engineer at a mobile gaming company. Milene is passionate about linking data to concrete business problems. She holds a master’s degree in engineering from Telecom ParisTech, France.
©2018, O'Reilly Media, Inc.