Sep 23–26, 2019

Turning big data into knowledge: Managing metadata and data relationships at Uber's scale

Kaan Onuk (Uber), Luyao Li (Uber), Atul Gupte (Uber)
2:55pm–3:35pm Wednesday, September 25, 2019
Location: 1A 23

Who is this presentation for?

  • Data scientists, engineers, data analysts, and product managers

Level

Intermediate

Description

Uber takes being data driven to the next level with the complexity of its systems and the breadth of its data: it processes trillions of Kafka messages per day, operates thousands of microservices, stores hundreds of petabytes of data in the Hadoop Distributed File System (HDFS) across multiple data centers, and supports millions of weekly analytical queries.

Kaan Onuk, Luyao Li, and Atul Gupte explore the current state of metadata and lineage management at Uber’s scale and share a sneak peek of what’s coming next in big data management.

Big data by itself isn’t enough to yield insights; to be used efficiently and effectively, data at Uber’s scale requires context for making business decisions. To provide that context, the company built Databook, Uber’s in-house platform that surfaces and manages metadata, and uStruct, the lineage platform that understands the end-to-end data flow and manages relationships from Uber’s mobile app to services, storage, and analytics.

Prerequisite knowledge

  • A basic understanding of big data concepts

What you'll learn

  • Discover how Uber thinks about building big data knowledge platforms to allow teams to discover, manage, and govern entities
  • Explore how to build an extensible data management platform and infrastructure to democratize data at Uber's scale

Kaan Onuk

Uber

Kaan Onuk is an engineering manager at Uber, where he leads the metadata management team in the Big Data org. Previously, he was a tech lead at Uber, where he designed and built infrastructure to power data discovery and data privacy, and he helped build data infrastructure from the ground up at Graphiq, a startup acquired by Amazon. Kaan holds a master’s degree in electrical engineering from the University of Southern California.


Luyao Li

Uber

Luyao Li is a technical lead manager on the data platform team at Uber, where he manages the data lineage team, which builds systems including end-to-end data flow tracking, latency tracking, and cost attribution and pricing. Previously, as a software engineer at Electronic Arts, he built multiple systems spanning service discovery, configuration management, and ad campaign results tracking and reporting. He holds a master’s degree from Duke University.


Atul Gupte

Uber

Atul Gupte is a product manager on the product platform team at Uber, where he helps drive product decisions to ensure Uber’s data science teams are able to achieve their full potential by providing access to foundational infrastructure, stable compute resources, and advanced tooling to power Uber’s global ambitions. Previously, he built some of the world’s leading social games and helped build out the mobile advertising platform at Zynga. He holds a BS in computer science from the University of Illinois at Urbana-Champaign.

