Sep 23–26, 2019

Real-time SQL Stream Processing at Scale with Apache Kafka and KSQL

Ricardo Ferreira (Confluent)
9:00am–12:30pm Tuesday, September 24, 2019
Location: 1E 10
Secondary topics:  Data Integration and Data Processing, Deep dive into specific tools, platforms, or frameworks, Streaming and IoT

Who is this presentation for?

Data Engineers, Developers, DBAs



Prerequisite knowledge

SQL, fundamentals of databases, basics of Linux/shell, Docker

Materials or downloads needed in advance

Attendees will need their own laptop and have completed the steps at

What you'll learn

- Best practices around building pipelines with Apache Kafka
- How to use just config and SQL to build complete ETL pipelines
- Patterns for integration with databases
- Anti-patterns to be aware of


Have you ever thought that you needed to be a programmer to do stream processing and build streaming data pipelines? Think again! Apache Kafka is a distributed, scalable, and fault-tolerant streaming platform, providing low-latency pub-sub messaging coupled with native storage and stream processing capabilities. Integrating Kafka with RDBMS, NoSQL, and object stores is simple with Kafka Connect, which is part of Apache Kafka. KSQL is the open-source SQL streaming engine for Apache Kafka and makes it possible to build stream processing applications at scale, written using a familiar SQL interface.

In this workshop, you will learn the architectural reasoning for Apache Kafka and the benefits of real-time integration, and then build a streaming data pipeline using nothing but your bare hands, Kafka Connect, and KSQL. Gasp as we filter events in real time! Be amazed at how we can enrich streams of data with data from RDBMS! Be astonished at the power of streaming aggregates for anomaly detection!
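The three steps named above (filtering events, enriching them with database data, and aggregating over time windows to spot anomalies) can be sketched in plain Python. This is not KSQL and not the workshop's material: the event fields, the customer lookup table, the 60-second tumbling window, and the threshold of 3 are all assumptions invented for illustration. Unlike a real stream processor, the sketch also reads a finite list and aggregates it in memory.

```python
from collections import defaultdict

# Hypothetical sample events: (timestamp_seconds, user_id, rating)
EVENTS = [
    (0, 1, 5),
    (10, 2, 1),
    (20, 2, 2),
    (30, 2, 1),
    (70, 1, 1),
    (80, 3, 4),
]

# Hypothetical reference data, standing in for a table loaded from an RDBMS
CUSTOMERS = {1: "alice", 2: "bob", 3: "carol"}


def filter_events(events, max_rating=2):
    """Keep only low ratings (analogous to a WHERE clause on a stream)."""
    for ts, user_id, rating in events:
        if rating <= max_rating:
            yield ts, user_id, rating


def enrich(events, customers):
    """Attach the customer name to each event (analogous to a stream-table join)."""
    for ts, user_id, rating in events:
        yield ts, customers.get(user_id, "unknown"), rating


def count_per_window(events, window_seconds=60):
    """Count events per (window start, customer) using tumbling windows."""
    counts = defaultdict(int)
    for ts, name, _rating in events:
        window_start = ts - ts % window_seconds
        counts[(window_start, name)] += 1
    return dict(counts)


def detect_anomalies(counts, threshold=3):
    """Flag keys whose count reaches the threshold (analogous to HAVING)."""
    return sorted(key for key, n in counts.items() if n >= threshold)


low = filter_events(EVENTS)
named = enrich(low, CUSTOMERS)
counts = count_per_window(named)
anomalies = detect_anomalies(counts)
print(anomalies)  # prints [(0, 'bob')]
```

In this sample, "bob" submits three low ratings inside the first 60-second window, so that window is the only one flagged. In a streaming SQL engine, each of these functions would roughly correspond to one declarative statement rather than hand-written code.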

Ricardo Ferreira


Ricardo is a Developer Advocate at Confluent, the company founded by the creators of Apache Kafka. He has more than 21 years of experience in software development, where he has specialized in distributed systems architectures such as integration, SOA, NoSQL, messaging, in-memory caching, and cloud computing. Prior to Confluent, he worked for other vendors such as Oracle, Red Hat, and IONA Technologies, as well as several consulting firms. When not working, and like any good Brazilian, he loves having churrascos (Brazilian barbecues) with his friends and family, where he gets the chance to talk about anything that is not geek related. He can be easily found on Twitter @riferrei or via his blog
