Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Foundations of streaming SQL; or, How I learned to love stream and table theory

Tyler Akidau (Google)
1:15pm–1:55pm Thursday, September 28, 2017
Data Engineering & Architecture, Stream processing and analytics
Location: 1E 07/08 Level: Intermediate
Secondary topics: Streaming
Average rating: 4.40 (5 ratings)

Who is this presentation for?

  • Anyone interested in data processing

Prerequisite knowledge

  • Familiarity with the Beam model and stream and table theory

What you'll learn

  • Understand the key concepts underpinning data processing
  • Learn what robust stream processing in SQL looks like

Description

What does it mean to execute streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing? And how does all of this relate to the programmatic frameworks we’re all familiar with? Tyler Akidau answers these questions and more as he walks you through key concepts underpinning data processing in general.

Tyler begins by exploring the relationship between the Beam model (as described in his paper “The Dataflow Model” and the “Streaming 101” and “Streaming 102” blog posts) and stream and table theory (as popularized by Martin Kleppmann and Jay Kreps, among others). It turns out that stream and table theory does an illuminating job of describing the low-level concepts that underlie the Beam model.

Tyler then explains what is required to provide robust stream processing support in SQL, discussing the concrete efforts that have been made in this area by the Apache Beam, Calcite, and Flink communities, as well as new ideas yet to come. You’ll leave with a much better understanding of the key concepts underpinning data processing—regardless of whether that data processing is batch or streaming or SQL or programmatic—as well as a concrete notion of what robust stream processing in SQL looks like.

Tyler Akidau

Google

Tyler Akidau is a senior staff software engineer at Google Seattle, where he leads technical infrastructure internal data processing teams for MillWheel and Flume. Tyler is a founding member of the Apache Beam PMC and has spent the last seven years working on massive-scale data processing systems. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer that batch and streaming are two sides of the same coin and that the real endgame for data processing systems is the seamless merging of the two. He is the author of the 2015 “Dataflow Model” paper and the “Streaming 101” and “Streaming 102” blog posts. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.