Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Why and how to leverage the power and simplicity of SQL on Apache Flink

Fabian Hueske (Ververica)
1:15pm–1:55pm Wednesday, 09/12/2018
Streaming systems & real-time applications
Location: 1E 07/08 Level: Intermediate

Who is this presentation for?

  • Data engineers and architects

Prerequisite knowledge

  • Familiarity with SQL and stream processing

What you'll learn

  • Learn how SQL unifies batch and stream processing and what it means to run a SQL query on a stream
  • Understand the scope of Flink's SQL support
  • Explore Flink's new query submission service

Description

Everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers.

Fabian Hueske discusses why and how to leverage the simplicity and power of SQL on Flink. Fabian starts by exploring the use cases that Flink SQL was designed for and presents some real-world problems that it can solve. In particular, he explains why unified batch and stream processing is important and what it means to run SQL queries on streams of data. Fabian then demonstrates how to leverage Flink’s full potential.

Since the end of last year, the Flink community has been working on a service that integrates a query interface, (external) table catalogs, and result-serving functionality for static, appending, and updating result sets. Fabian explores the design and features of this query service and details how it enables exploratory batch and streaming queries, ETL pipelines, and live-updating query results that serve applications such as real-time dashboards.
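
To make the difference between appending and updating result sets concrete, here is a minimal sketch in plain Python. It does not use Flink or its API; the event schema, function names, and changelog encoding are all assumptions chosen for illustration. A filter query over a stream only ever appends rows to its result, while a grouped aggregation must revise rows it has already emitted, so its result is modeled here as a changelog of retractions and additions.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event schema: (user, amount).
Event = Tuple[str, int]
# Hypothetical changelog record: ("+" add or "-" retract, user, total).
Change = Tuple[str, str, int]


def appending_filter(events: Iterable[Event], min_amount: int) -> Iterator[Event]:
    """Roughly: SELECT * FROM events WHERE amount >= min_amount.

    Each input row is either dropped or appended; emitted rows never change.
    """
    for user, amount in events:
        if amount >= min_amount:
            yield (user, amount)


def updating_sum(events: Iterable[Event]) -> Iterator[Change]:
    """Roughly: SELECT user, SUM(amount) FROM events GROUP BY user.

    Every new event changes one group's total, so the previous row for
    that group is retracted before the new row is added.
    """
    totals: Dict[str, int] = {}
    for user, amount in events:
        if user in totals:
            yield ("-", user, totals[user])
        totals[user] = totals.get(user, 0) + amount
        yield ("+", user, totals[user])


def materialize(changelog: Iterable[Change]) -> Dict[str, int]:
    """Apply a changelog to obtain the current view, as a dashboard might."""
    view: Dict[str, int] = {}
    for op, user, total in changelog:
        if op == "+":
            view[user] = total
        else:
            view.pop(user, None)
    return view


events: List[Event] = [("alice", 5), ("bob", 20), ("alice", 15)]
appended = list(appending_filter(events, 10))
changes = list(updating_sum(events))
final_view = materialize(changes)
```

With a finite input the final view matches what a batch query would return over the same rows, which is one way to read the claim that SQL unifies batch and stream processing. On an unbounded stream the changelog simply never ends and the view keeps updating.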

Fabian Hueske

Ververica

Fabian Hueske is a committer and PMC member of the Apache Flink project. He was one of the three original authors of the Stratosphere research system, from which Apache Flink was forked in 2014. Fabian is a cofounder of Ververica, a Berlin-based startup devoted to fostering Flink, where he works as a software engineer and contributes to Apache Flink. He holds a PhD in computer science from TU Berlin and is currently spending a lot of his time writing a book, Stream Processing with Apache Flink.