Presented By O’Reilly and Cloudera

Make Data Work

March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Lyft's analytics pipeline: From Redshift to Apache Hive and Presto

Shenghu Yang (Lyft)

4:20pm–5:00pm Thursday, March 8, 2018

Big data and data science in the cloud, Data engineering and architecture
Location: LL21 E/F

Who is this presentation for?

Data engineers, analysts, and data scientists

Prerequisite knowledge

A basic understanding of big data and business analytics

What you'll learn

Explore the evolution of Lyft's data pipeline, from AWS Redshift clusters to Apache Hive and Presto

Description

Lyft’s business has grown more than 100x in the past four years. Shenghu Yang explains how Lyft’s data pipeline has evolved to serve its ever-growing analytics use cases, migrating from the world’s largest AWS Redshift clusters to Apache Hive and Presto to overcome hard limits on scalability and concurrency.

Topics include:

How Lyft’s data pipeline evolved
A flexible architecture that shares storage and a metastore but separates computation
How Hive replaces Redshift ETL
How Presto complements Hive for ad hoc queries
Lyft’s self-service tools
How Lyft educates end users about its data systems
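
The second topic, shared storage and a shared metastore with separate compute, can be pictured with a toy model. The sketch below is only an illustration of the general pattern, not Lyft's implementation: the class names, the `route` helper, the `rides` table, and the `s3://example-bucket/...` path are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableEntry:
    """Metastore record: where a table's files live and which columns it has."""
    location: str
    columns: tuple


@dataclass
class Metastore:
    """One catalog that every compute engine consults (hypothetical model)."""
    tables: dict = field(default_factory=dict)

    def register(self, name, location, columns):
        self.tables[name] = TableEntry(location, tuple(columns))

    def lookup(self, name):
        return self.tables[name]


@dataclass
class Engine:
    """A compute cluster that stores no data itself, only a metastore reference."""
    name: str
    metastore: Metastore

    def plan(self, table):
        entry = self.metastore.lookup(table)
        return f"{self.name} scans {entry.location}"


def route(workload, batch_engine, adhoc_engine):
    """Send scheduled ETL to the batch engine and everything else to the ad hoc engine."""
    return batch_engine if workload == "etl" else adhoc_engine


# Hypothetical table and storage path, registered once in the shared catalog.
metastore = Metastore()
metastore.register("rides", "s3://example-bucket/warehouse/rides/", ["ride_id", "city"])

# Two independent engines share the same catalog and therefore the same files.
hive = Engine("hive", metastore)
presto = Engine("presto", metastore)

print(route("etl", hive, presto).plan("rides"))
print(route("adhoc", hive, presto).plan("rides"))
```

Because both engines resolve tables through the same catalog, a table written by the batch side is visible to the ad hoc side without copying data, and each compute cluster can be sized independently.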

Shenghu Yang

Lyft

Shenghu Yang is an engineering manager at Lyft, where he was a founding member of the company’s data platform team and now runs the data tools team. Previously, Shenghu worked on cloud computing and digital marketing engineering at Oracle and @WalmartLabs. He holds an MS from Carnegie Mellon University.
