Mar 15–18, 2020

How Lyft built a streaming data platform on Kubernetes

Micah Wylde (Lyft)
2:35pm–3:15pm Wednesday, March 18, 2020
Location: LL21 C

Who is this presentation for?

Data engineers, data architects, developers

Level

Intermediate

Description

Access to real-time data is increasingly important for many organizations. This is particularly true for Lyft, which needs to respond immediately to changes in supply and demand in its marketplace, weather and traffic updates, fraud attempts, and dangerous driving situations. Doing so requires processing millions of events per second produced by its microservices and mobile apps.

Micah Wylde explains how Lyft runs dozens of Apache Flink and Apache Beam pipelines. Flink provides a powerful framework that makes it easy for nonexperts to write correct, high-scale streaming jobs, while Beam extends that power to Lyft’s large base of Python programmers. Lyft also built a real-time SQL engine called Dryft, primarily used by data scientists to power real-time machine learning models, and a near-real-time ad hoc querying system with Presto.

Historically, Lyft ran its Flink clusters on bare, custom-managed EC2 instances. To achieve greater elasticity and reliability, it rebuilt its streaming platform on top of Kubernetes. You’ll discover how Lyft designed and built an open source Kubernetes operator for Flink and Beam, learn about some of the unique challenges of running a complex, stateful application on Kubernetes, and hear the lessons learned along the way.

Prerequisite knowledge

  • Familiarity with data processing technologies

What you'll learn

  • Discover how you can use Flink, Beam, and Kubernetes to empower your engineers and data scientists to take advantage of real-time data

Micah Wylde

Lyft

Micah Wylde is a software engineer on the streaming compute team at Lyft, focused on the development of Apache Flink and Apache Beam. Previously, he built data infrastructure for fighting internet fraud at SIFT and real-time bidding infrastructure for ads at Quantcast.
