Sep 23–26, 2019

Turning Big Data into Knowledge: Managing metadata and data-relationships at Uber scale

Kaan Onuk (Uber), Luyao Li (Uber), Atul Gupte (Uber)
2:55pm–3:35pm Wednesday, September 25, 2019
Location: 1A 23/24
Secondary topics: Data quality, data governance and data lineage, Transportation and Logistics

Who is this presentation for?

Data Scientists, Engineers, Data Analysts, Product Managers




Uber takes being data-driven to the next level with the complexity of its systems and the breadth of its data: it processes trillions of Kafka messages per day, operates thousands of microservices, stores hundreds of petabytes of data in HDFS across multiple data centers, and supports millions of weekly analytical queries.

Big data by itself, though, isn’t enough; to be used efficiently and effectively, data at Uber scale requires context before it can inform business decisions and yield insights. To provide that context, we built Databook, Uber’s in-house platform that surfaces and manages metadata, and uStruct, the lineage platform that understands the end-to-end data flow and manages relationships from Uber’s mobile app to services, storage, and analytics.

In this talk, we will explore the current state of metadata/lineage management at Uber scale and what’s coming next in big data management.

Prerequisite knowledge

Basic Big Data concepts

What you'll learn

In this talk, we will discuss how we think about building big data knowledge platforms to allow teams to discover, manage and govern data entities. Specifically, we will explore how to build an extensible data management platform and infrastructure to democratize data at Uber scale.

Kaan Onuk


Kaan Onuk is the Engineering Manager in the Data Platform Team at Uber. Previously, he worked as a tech lead at Uber, building metadata management infrastructure to transform big data into knowledge. Prior to Uber, he was the founding member of the data infrastructure team at Graphiq, a semantic technology startup that was later acquired by Amazon to help improve Alexa. He holds a Master’s degree in Electrical Engineering from the University of Southern California. Kaan can be reached on LinkedIn.

Luyao Li


Luyao Li is a senior software engineer at Uber. He is an enthusiast of building reliable, scalable and performant systems. Prior to Uber, he architected EA’s global ad-campaign SaaS suite, including Segmentation Manager and Engagement Manager, on top of data from all franchises. He holds a master’s degree from Duke University.

Atul Gupte


Atul Gupte is a Product Manager on the Product Platform team at Uber. He holds a BS in Computer Science from the University of Illinois at Urbana-Champaign. At Uber, he helps drive product decisions to ensure Uber’s data science teams are able to achieve their full potential by providing access to foundational infrastructure, stable compute resources, and advanced tooling to power Uber’s global ambitions. Previously, at Zynga, he spent time building some of the world’s leading social games and also helped build out the company’s mobile advertising platform.

Contact list

View a complete list of Strata Data Conference contacts