Sep 23–26, 2019

Turning big data into knowledge: Managing metadata and data relationships at Uber's scale

Kaan Onuk (Uber), Luyao Li (Uber), Atul Gupte (Uber)
2:55pm–3:35pm Wednesday, September 25, 2019
Location: 1A 23/24
Average rating: 4.25 (8 ratings)

Who is this presentation for?

  • Data scientists, engineers, data analysts, and product managers




Uber takes being data driven to the next level through the complexity of its systems and the breadth of its data: it processes trillions of Kafka messages per day, operates thousands of microservices, stores hundreds of petabytes of data in the Hadoop Distributed File System (HDFS) across multiple data centers, and supports millions of weekly analytical queries.

Kaan Onuk, Luyao Li, and Atul Gupte explore the current state of metadata and lineage management at Uber’s scale and share a sneak peek of what’s coming next in big data management.

Big data by itself isn’t enough to yield insights; to be used efficiently and effectively for business decisions, data at Uber’s scale requires context. To provide that context, the company built Databook, Uber’s in-house platform that surfaces and manages metadata, and uStruct, the lineage platform that tracks the end-to-end data flow and manages relationships from Uber’s mobile app through services, storage, and analytics.

Prerequisite knowledge

  • A basic understanding of big data concepts

What you'll learn

  • Discover how Uber thinks about building big data knowledge platforms to allow teams to discover, manage, and govern entities
  • Explore how to build an extensible data management platform and infrastructure to democratize data at Uber's scale

Kaan Onuk


Kaan Onuk is an engineering manager at Uber, where he leads the metadata management team in the Big Data org. Previously, he was a tech lead at Uber, where he designed and built infrastructure to power data discovery and data privacy, and he helped build data infrastructure from the ground up at Graphiq, a startup acquired by Amazon. Kaan holds a master’s degree in electrical engineering from the University of Southern California.


Luyao Li


Luyao Li is a technical lead manager on the data platform team at Uber, where he manages the data lineage team, which builds systems including end-to-end data flow tracking, latency tracking, and cost attribution and pricing. Previously, as a software engineer at Electronic Arts, he built multiple systems spanning service discovery, configuration management, and ad campaign results tracking and reporting. He holds a master’s degree from Duke University.


Atul Gupte


Atul Gupte is a product manager on the product platform team at Uber, where he helps drive product decisions to ensure Uber’s data science teams are able to achieve their full potential by providing access to foundational infrastructure, stable compute resources, and advanced tooling to power Uber’s global ambitions. Previously, he built some of the world’s leading social games and helped build out the mobile advertising platform at Zynga. He holds a BS in computer science from the University of Illinois at Urbana-Champaign.



Anushka Jadhav | sr software engineer
10/09/2019 4:35pm EDT

+1 . Can you please post the slides.

Amanda Landrum | SR. Information Delivery Manager
10/09/2019 11:47am EDT

Can you share the slides you reviewed. Thanks much!
