Presented By O’Reilly and Cloudera

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Near-real-time anomaly detection at Lyft

Thomas Weise (Lyft), Mark Grover (Lyft)

11:20am–12:00pm Thursday, 09/13/2018

Data engineering and architecture, Streaming systems & real-time applications
Location: 1E 07/08
Level: Beginner

Secondary topics: Temporal data and time-series analytics, Transportation and Logistics

Who is this presentation for?

Data engineers, data scientists, architects, and technical decision makers

Prerequisite knowledge

Basic familiarity with big data processing use cases

What you'll learn

Explore Lyft’s streaming platform and see how Lyft uses it to perform anomaly detection
Understand how data science and data engineering processes can be brought together for faster outcomes

Description

Consumer-facing real-time processing must protect against fraudulent transactions and other risks, which poses a number of challenges. The streaming platform at Lyft addresses this with an architecture that pairs a data science-friendly programming environment with a deployment stack built to meet the reliability, scalability, and other SLA requirements of a mission-critical stream processing system.

Thomas Weise and Mark Grover explain how Lyft uses its streaming platform to detect and respond to anomalous events. Reacting to such events with traditional development methodologies is challenging, especially where low-latency SLAs for instant user feedback are critical. Enabling data science tools for machine learning, together with a process that allows fast and predictable deployment, is increasingly important.

Topics include:

A deep dive into Lyft’s streaming platform, covering use cases, system architecture, and key requirements that drive technology choices
Examples of risk and fraud analysis on real-time transaction streams, including credit card and location data, based on machine learning models and historical data
A data scientist-friendly development environment with the Python ecosystem and tools that allow users to focus on business logic
The Apache Beam portability framework as a bridge to distributed execution on a JVM-based streaming engine, without code rewrites
A data engineering process for continuous integration and deployment with a focus on reliability and operability
Apache Flink-based streaming execution for scalability, high availability, and low-latency processing
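
To make the idea of anomaly detection on a transaction stream concrete, here is a minimal sketch in plain Python. It is not Lyft's method and does not use Apache Beam or Apache Flink: it assumes a simple rolling z-score rule over transaction amounts, and the names RollingZScoreDetector, detect, window, threshold, and min_samples are hypothetical, chosen only for illustration.

```python
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    """Flags values that deviate strongly from a rolling window of recent values.

    Illustrative only: a z-score rule is an assumption, not the approach
    described in the talk.
    """

    def __init__(self, window=50, threshold=3.0, min_samples=5):
        self.values = deque(maxlen=window)  # most recent non-anomalous values
        self.threshold = threshold          # z-score above which a value is flagged
        self.min_samples = min_samples      # warm-up size before any flagging

    def observe(self, value):
        """Return True if `value` is anomalous relative to the current window."""
        is_anomaly = False
        n = len(self.values)
        if n >= self.min_samples:
            mean = sum(self.values) / n
            variance = sum((v - mean) ** 2 for v in self.values) / n
            std = sqrt(variance)
            # A zero standard deviation gives no scale to compare against,
            # so nothing is flagged in that case.
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # Flagged values are kept out of the window so they do not
        # inflate the baseline used for later events.
        if not is_anomaly:
            self.values.append(value)
        return is_anomaly


def detect(stream, **kwargs):
    """Return (index, value) pairs for every flagged event in `stream`."""
    detector = RollingZScoreDetector(**kwargs)
    return [(i, x) for i, x in enumerate(stream) if detector.observe(x)]


if __name__ == "__main__":
    amounts = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 500.0, 10.2]
    print(detect(amounts))
```

In a production pipeline, per-event logic of this kind would typically be keyed (for example per user or per card) and run inside a streaming engine that manages state and fault tolerance; the sketch keeps everything in memory for a single stream.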

Thomas Weise

Lyft

Thomas Weise is a software engineer for the streaming platform at Lyft. He’s also a PMC member for the Apache Apex and Apache Beam projects and has contributed to several more projects within the ASF ecosystem. Thomas is a frequent speaker at international big data conferences and the author of Learning Apache Apex.

Mark Grover

Lyft

Mark Grover is a product manager at Lyft. Mark’s a committer on Apache Bigtop, a committer and PPMC member on Apache Spot (incubating), and a committer and PMC member on Apache Sentry. He’s also contributed to a number of open source projects, including Apache Hadoop, Apache Hive, Apache Sqoop, and Apache Flume. He’s a coauthor of Hadoop Application Architectures and wrote a section in Programming Hive. Mark is a sought-after speaker on topics related to big data. He occasionally blogs on topics related to technology.

Comments

Mark Grover | PRODUCT MANAGER

09/13/2018 5:40am EDT

Hi all,
We are super excited to see you all real soon! You won’t want to miss this.

The slides are posted at go.lyft.com/streaming-at-lyft
