Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Real-time data pipelines with Apache Kafka

Tim Berglund (Confluent)
13:30–17:00 Tuesday, 23 May 2017
Stream processing and analytics
Location: Capital Suite 4
Level: Intermediate
Average rating: 3.50 (2 ratings)

Who is this presentation for?

  • Developers, data scientists, and anyone who wants to learn more about setting up and running Apache Kafka to build real-time data pipelines

Prerequisite knowledge

  • Basic knowledge of Apache Kafka

Materials or downloads needed in advance

  • A laptop with at least 4 GB of RAM, with VirtualBox and the workshop VM installed
  • Please download this virtual machine for Tuesday's workshop *BEFORE* you arrive onsite. You will also need VirtualBox installed on your computer to make use of the VM.

What you'll learn

  • Learn how to configure Kafka Connect to move data between external systems and Apache Kafka and write a real-time stream processing application using the Kafka Streams DSL
  • Understand how easy it is to scale Connect and Streams as your data volume increases


Tim Berglund demonstrates how to use Kafka Connect and Kafka Streams to build real-world, real-time streaming data pipelines. Kafka Connect ingests data from a relational database into Kafka topics as the data is being generated, and Kafka Streams then processes and enriches that data in real time before writing it out for further analysis.

You’ll learn how easy it is to use Connect to ingest and export data (no code required), and how the Kafka Streams domain-specific language (DSL) means that developers can concentrate on business logic without worrying about the low-level plumbing of streaming data processing. And because Streams is a Java library, developers can build real-time applications without needing a separate cluster to run an external stream processing framework.
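As a rough illustration of what "no code required" can look like, here is a minimal sketch of a Kafka Connect source configuration. It assumes the Confluent JDBC source connector, which this page does not name, and every value in it (connector name, database URL, table, column, and topic prefix) is a hypothetical placeholder rather than anything taken from the workshop materials.

```properties
# Hypothetical standalone-mode source connector config; all values are placeholders.
# Assumes the Confluent JDBC source connector is installed on the Connect worker.
name=example-jdbc-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1

# Placeholder connection string; replace with your own database.
connection.url=jdbc:postgresql://localhost:5432/exampledb?user=example&password=example

# Poll for new rows by watching a strictly increasing ID column.
mode=incrementing
incrementing.column.name=id

# Only copy this table; each table is written to the topic <topic.prefix><table name>.
table.whitelist=orders
topic.prefix=db-
```

With a file like this, a standalone Connect worker would publish new rows of the assumed `orders` table to a topic named `db-orders`, where a Kafka Streams application could then read and enrich them.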


Tim Berglund


Tim Berglund is the senior director of developer experience with Confluent, where he serves as a teacher, author, and technology leader. Tim can frequently be found speaking at conferences internationally and in the United States. He’s the copresenter of various O’Reilly training videos on topics ranging from Git to distributed systems and is the author of Gradle Beyond the Basics. He tweets as @tlberglund, blogs very occasionally at, and is the cohost of the DevRel Radio podcast. He lives in Littleton, Colorado, with the wife of his youth and their youngest child, the other two having mostly grown up.

Comments on this page are now closed.


Lescop Celine | LEAD ARCHITECT
22/05/2017 12:38 BST

I mean the virtual machine to load in virtual box.
My email :

Lescop Celine | LEAD ARCHITECT
22/05/2017 12:37 BST

Hi Tim, I will attend this session tomorrow. Where can I find the virtualbox we are supposed to use tomorrow ?
Many thanks,