Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Many streams lead to Kafka - An event data workshop

Jesse Anderson (Big Data Institute), Ewen Cheslack-Postava (Confluent)
9:00am–12:30pm Tuesday, 09/29/2015
Data Innovations
Location: 3D 04/09
Level: Intermediate
Average rating: 3.25 out of 5 (12 ratings)
Slides: 1-PDF

Materials or downloads needed in advance

Bring your own laptop with a 64-bit CPU, at least 3 GB of free RAM, and VirtualBox 4.3.x installed.

Please make sure to download and decompress the Virtual Machine image found at BEFORE arriving onsite.


See how Kafka can help you harness the value of stream data in your organization! During this three-hour tutorial, we’ll discuss what Kafka is and its emerging, critical role in the modern data management and distribution pipeline.

We’ll also discuss key architectural concepts and developer APIs. The tutorial includes hands-on labs where you’ll build an application that can publish data to Kafka and subscribe to receive data from it.

Here’s the high-level agenda for the tutorial:

  • Introduction to what Kafka is, its capabilities, and major components
  • Types of data appropriate for Kafka
  • Producers, consumers, and brokers and their roles in a Kafka cluster
  • Developer APIs in various languages for publishing and subscribing to Kafka topics
  • Common patterns for application development with Kafka

This tutorial is ideal for application developers, extraction-transformation-load (ETL) developers, or data scientists who need to interact with Kafka clusters as a source of, or destination for, stream data.

Jesse Anderson

Big Data Institute

Jesse Anderson is a data engineer, creative engineer, and managing director of the Big Data Institute. Jesse trains employees on big data, including cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He’s taught thousands of students at companies ranging from startups to the Fortune 100 the skills to become data engineers. He’s widely regarded as an expert in the field and recognized for his novel teaching practices. Jesse is published by O’Reilly and Pragmatic Programmers and has been covered in such prestigious media outlets as the Wall Street Journal, CNN, BBC, NPR, Engadget, and Wired. You can learn more about Jesse at

Ewen Cheslack-Postava


Ewen Cheslack-Postava is an engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. Ewen received his PhD from Stanford University, where he developed Sirikata, an open source system for massive virtual environments. His dissertation defined a novel type of spatial query giving significantly improved visual fidelity and described a system for efficiently processing these queries at scale.

Comments on this page are now closed.


Matthew Lurie
09/29/2015 6:34am EDT

Update: the files can be found on the USB drives floating around as well as this link:

clifton liu
09/29/2015 5:16am EDT

I can’t find the information either

Matthew Lurie
09/29/2015 5:09am EDT

I have the VM, but where can I find the slides and exercises?