Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Unlocking the world of stream processing with KSQL, the streaming SQL engine for Apache Kafka

Michael Noll (Confluent)
14:05–14:45 Wednesday, 23 May 2018
Average rating: 4.67 (6 ratings)

Who is this presentation for?

  • Architects, vice presidents of engineering, CTOs, data engineers, data scientists, and application developers

Prerequisite knowledge

  • Familiarity with databases and big data technologies such as Apache Kafka and Hadoop

What you'll learn

  • Explore KSQL and learn how to use it to do stream processing without writing code in a programming language like Java or Scala

Description

Modern businesses have data at their core, and this data is changing continuously. Stream processing is what allows you to harness this torrent of information in real time, and thousands of companies use Apache Kafka as the core platform for streaming data to transform and reshape their industries. However, the world of stream processing still has a very high barrier to entry. Today’s most popular stream processing technologies require the user to write code in programming languages such as Java or Scala. This hard requirement on coding skills prevents many companies from unlocking the full benefits of stream processing.

Now imagine that, instead of having to write a lot of code in a programming language like Java or Scala for your favorite stream processing technology, all you needed to get started with stream processing was a simple SQL statement, such as: SELECT * FROM payments-kafka-stream WHERE fraudProbability > 0.8.
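
To make the idea of a continuous query concrete, here is a minimal sketch in plain Python of what such a statement expresses: a filter that is applied to every event as it arrives, rather than once over a finished data set. This is not KSQL and not how KSQL is implemented. The field name fraudProbability comes from the statement above; the paymentId field, the function names, and the sample events are assumptions made up for illustration.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical payment event: only fraudProbability appears in the abstract,
# paymentId is an assumed field added for illustration.
Payment = Dict[str, object]


def suspicious_payments(stream: Iterable[Payment],
                        threshold: float = 0.8) -> Iterator[Payment]:
    """Yield each payment whose fraudProbability exceeds the threshold.

    The input may be unbounded; results are produced one by one as events
    arrive, which is the key difference from a one-off database query.
    """
    for payment in stream:
        if float(payment["fraudProbability"]) > threshold:
            yield payment


def demo_stream() -> Iterator[Payment]:
    """Stand-in for a Kafka topic: a small, finite sequence of sample events."""
    yield {"paymentId": "p1", "fraudProbability": 0.10}
    yield {"paymentId": "p2", "fraudProbability": 0.93}
    yield {"paymentId": "p3", "fraudProbability": 0.80}  # not strictly above 0.8
    yield {"paymentId": "p4", "fraudProbability": 0.85}


if __name__ == "__main__":
    for hit in suspicious_payments(demo_stream()):
        print(hit["paymentId"])
```

In a real deployment the events would come from a Kafka topic and the query would keep running indefinitely; the generator here only mimics that behavior on a fixed list.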

Michael Noll offers an overview of KSQL, the open source streaming SQL engine for Apache Kafka, which makes it easy to get started with a wide range of real-time use cases, such as monitoring application behavior and infrastructure, detecting anomalies and fraudulent activities in data feeds, and real-time ETL. With KSQL, there’s no need to write any code in a programming language. KSQL brings together the worlds of streams and databases by allowing you to work with your data in a stream and in a table format. Built on top of Kafka’s Streams API, KSQL supports many powerful operations, including filtering, transformations, aggregations, joins, windowing, sessionization, and much more. It is open source (Apache 2.0 licensed), distributed, scalable, fault tolerant, and real time. You’ll learn how KSQL makes it easy to get started with a wide range of stream processing use cases and how to get up and running as you explore how it all works under the hood.
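
Two of the ideas mentioned above, working with data as a stream or as a table and aggregating over windows, can be illustrated with a small sketch in plain Python. It is only a conceptual model under stated assumptions, not KSQL syntax or the Kafka Streams API: the event layout (timestamp, user, amount), the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event layout: (timestamp in seconds, user, amount).
Event = Tuple[int, str, float]


def to_table(events: Iterable[Event]) -> Dict[str, float]:
    """Table view of a stream: keep only the latest amount per user.

    Every event is treated as an update, so later events for the same
    user overwrite earlier ones.
    """
    table: Dict[str, float] = {}
    for _, user, amount in events:
        table[user] = amount
    return table


def tumbling_counts(events: Iterable[Event],
                    window_seconds: int) -> Dict[Tuple[int, str], int]:
    """Count events per user in fixed, non-overlapping (tumbling) windows.

    The result is keyed by (window start time, user).
    """
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for ts, user, _ in events:
        window_start = ts - ts % window_seconds
        counts[(window_start, user)] += 1
    return dict(counts)


# Made-up sample data for illustration.
EVENTS = [
    (0, "alice", 10.0),
    (20, "bob", 5.0),
    (45, "alice", 7.5),
    (70, "alice", 3.0),
]

if __name__ == "__main__":
    print(to_table(EVENTS))
    print(tumbling_counts(EVENTS, window_seconds=60))
```

The same sequence of events yields both views: read event by event it is a stream, and folded down to the latest value per key it is a table. A streaming engine maintains such results incrementally as new events arrive, whereas this sketch recomputes them over a fixed list.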

Michael Noll

Confluent

Michael Noll is a technologist in the office of the CTO at Confluent, the company founded by the creators of Apache Kafka. Previously, Michael was the technical lead of DNS operator Verisign’s big data platform, where he grew the Hadoop, Kafka, and Storm-based infrastructure from zero to petabyte-sized production clusters spanning multiple data centers, one of the largest big data infrastructures in Europe at the time. He’s a well-known tech blogger in the big data community. In his spare time, Michael serves as a technical reviewer for publishers such as Manning and is a frequent speaker at international conferences, including Strata, ApacheCon, and ACM SIGIR. Michael holds a PhD in computer science.

Comments

Michael Noll | TECHNOLOGIST, OFFICE OF THE CTO
30/05/2018 12:31 BST

Slides are available at https://www.slideshare.net/miguno/unlocking-the-world-of-stream-processing-with-ksql-the-streaming-sql-engine-for-apache-kafka-98644313