Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Introduction to Flink via Flink SQL

Fabian Hueske (Ververica)
9:00am–12:30pm Tuesday, March 26, 2019
Secondary topics:  Streaming, realtime analytics, and IoT

Who is this presentation for?

  • Data engineers

Level

Beginner

Prerequisite knowledge

  • A basic understanding of SQL and stream processing

Materials or downloads needed in advance

  • A laptop with Docker and an IDE (e.g., IntelliJ) installed

What you'll learn

  • How to run SQL queries on streaming data with Apache Flink

Description

As data processing becomes more real time, stream processing is becoming more important. Apache Flink makes it easier to build and manage stream processing applications. Flink’s new SQL interface is a great way to get started with Flink—and to build and maintain production applications.

Fabian Hueske offers an overview of Apache Flink via the SQL interface, covering stream processing and Flink’s various modes of use. Then you’ll use Flink to run SQL queries on data streams and contrast this with the Flink DataStream API.

Outline:

Section 1

  • Survey of Apache Flink and its interfaces
  • Intro to SQL on Flink
  • Unified API for batch and streaming
  • Executing SQL queries on Flink
  • Documentation walkthrough

Section 2

  • Hands-on exercise: Setting up the SQL CLI client
  • Running the first queries
  • SQL on DataStreams
  • Tables, streams, and materialized views
  • Supported operations
  • Event time and processing time

Section 3

  • Hands-on exercise: Running queries on data streams
  • Windowed queries
  • Event time queries
  • Processing time queries
  • Materializing queries

Section 4

  • Flink APIs, internals, connectors, and UDFs
  • Table API and SQL
  • DataStream API
  • Hands-on exercise: Working with the DataStream API

Fabian Hueske

Ververica

Fabian Hueske is a committer and PMC member of the Apache Flink project. He was one of the three original authors of the Stratosphere research system, from which Apache Flink was forked in 2014. Fabian is a cofounder of Ververica, a Berlin-based startup devoted to fostering Flink, where he works as a software engineer and contributes to Apache Flink. He holds a PhD in computer science from TU Berlin and is currently spending a lot of his time writing a book, Stream Processing with Apache Flink.