Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

The Lyft data platform: Now and in the future

Mark Grover (Lyft), Deepak Tiwari (Lyft)
14:55–15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Average rating: 4.69 (13 ratings)

Who is this presentation for?

  • Those working in the big data space



Prerequisite knowledge

  • Basic knowledge of data platform architecture

What you'll learn

  • Understand how to develop an effective scalable data platform
  • Explore Lyft's platform and learn how it's continuing to evolve


Lyft’s data platform is at the heart of the company’s business. Decisions ranging from pricing to ETA to business operations rely on it, and it powers the enormous scale and speed at which Lyft operates.

Mark Grover and Deepak Tiwari cover the technologies Lyft uses for ETL, ad hoc querying, stream ingestion, stream processing, visualization, ML model training, and ML model development. Some of these technologies are open source (Hive, Presto, Spark), and some are homegrown (ML model training and model development engines, for example). Mark and Deepak also discuss other core facets of the data platform, including security, data discovery, and lineage. They explain why Lyft adopted open source tools in some cases and built its own in others, and how those choices have evolved over the years. They conclude with a glimpse of what lies ahead.

Mark Grover


Mark Grover is a product manager at Lyft. Mark’s a committer on Apache Bigtop, a committer and PPMC member on Apache Spot (incubating), and a committer and PMC member on Apache Sentry. He’s also contributed to a number of open source projects, including Apache Hadoop, Apache Hive, Apache Sqoop, and Apache Flume. He’s a coauthor of Hadoop Application Architectures and wrote a section in Programming Hive. Mark is a sought-after speaker on topics related to big data. He occasionally blogs on topics related to technology.

Deepak Tiwari


Deepak Tiwari is the head of product management for data at Lyft, where he’s responsible for the company’s data vision as well as for building its data infrastructure, data platform, and data products. This includes Lyft’s streaming infrastructure for real-time decision making, geodata store and visualization, platform for machine learning, and core infrastructure for big data analytics. Previously, he was a product management leader at Google, where he worked on search, cloud, and technical infrastructure products. Deepak is passionate about building products that are driven by data, focus on user experience, and work at web scale. He holds an MBA from Northwestern’s Kellogg School of Management and a BT in engineering from the Indian Institute of Technology, Kharagpur.

Comments on this page are now closed.


Mark Grover
1/05/2019 17:58 BST

Hi all,
Thanks for attending. Slides are at