Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

A hands-on introduction to Apache Kafka

Ian Wrigley (StreamSets)
9:00–12:30 Wednesday, 1/06/2016
Data innovations
Location: Capital Suite 14 Level: Intermediate
Average rating: 4.52 (21 ratings)

Prerequisite knowledge

Attendees should have basic familiarity with the Linux command line and Java or Python, although sample solutions will also be provided for those who are not developers. No prior knowledge of Kafka is required.

Materials or downloads needed in advance

The tutorial includes some hands-on exercises. If you want to follow along, you'll need a laptop with at least 4 GB of RAM and VirtualBox installed. Once you have installed VirtualBox, please download the virtual machine and Exercise Manual. Note that your laptop must be capable of running a 64-bit guest virtual machine; the easiest way to test this is to download the VM, launch it (double-click the .vbox file), and ensure it starts up. If it does not start properly, check your machine’s BIOS and ensure that VT-x is enabled.


Ian Wrigley leads a hands-on workshop on leveraging the capabilities of Apache Kafka to collect, manage, and process stream data for both big data projects and general-purpose enterprise data integration, covering key architectural concepts, developer APIs, use cases, and how to write applications that publish data to, and subscribe to data from, Kafka. Ian offers an overview of Kafka, explains how it works, and demonstrates how to use it to build modern data applications, with hands-on exercises in which you’ll build an application that can publish data to Kafka and subscribe to receive data from Kafka. This tutorial is ideal for application developers, ETL (extract, transform, load) developers, or data scientists who need to interact with Kafka clusters as a source of, or destination for, stream data.

Topics include:

  • An introduction to Kafka, its capabilities, and major components
  • Types of data appropriate for Kafka
  • Producers, consumers, and brokers and their roles in a Kafka cluster
  • Developer APIs in various languages for publication/subscription to Kafka Topics
  • Common patterns for application development with Kafka
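
For readers new to the producer, consumer, and broker roles listed above, the following is a minimal in-memory sketch of the idea, not the Kafka client API and not taken from the tutorial materials. The class names (ToyBroker, ToyProducer, ToyConsumer), the topic name, and the messages are all hypothetical. The sketch assumes a single append-only log per topic and leaves out partitions, replication, persistence, and consumer groups.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, message):
        """Append a message to the topic's log and return its offset."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset, max_messages=10):
        """Return up to max_messages messages starting at the given offset."""
        return self._logs[topic][offset:offset + max_messages]


class ToyProducer:
    """Publishes messages to a topic by handing them to the broker."""

    def __init__(self, broker):
        self._broker = broker

    def send(self, topic, message):
        return self._broker.append(topic, message)


class ToyConsumer:
    """Tracks its own read position, so each consumer progresses independently."""

    def __init__(self, broker, topic):
        self._broker = broker
        self._topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self._broker.read(self._topic, self.offset, max_messages)
        self.offset += len(batch)  # advance past the messages just read
        return batch


broker = ToyBroker()
producer = ToyProducer(broker)
producer.send("clicks", "page=/home")
producer.send("clicks", "page=/pricing")

# Two independent readers of the same topic: reading does not remove messages.
reader_a = ToyConsumer(broker, "clicks")
reader_b = ToyConsumer(broker, "clicks")
first = reader_a.poll(max_messages=1)
rest = reader_a.poll()
everything = reader_b.poll()
```

The point of the sketch is that messages stay in the log after being read, and each consumer only advances its own offset, so several applications can subscribe to the same stream without interfering with one another.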

Ian Wrigley


Ian Wrigley is a Technical Director at StreamSets, the company behind the industry’s first data operations platform. Over his 25-year career, Ian has taught tens of thousands of students subjects ranging from C programming to Hadoop development and administration.

Comments on this page are now closed.


Anahita Saghafi Saghafi
1/06/2016 10:36 BST

Is there a URL to download the slides please?

Leonardo Scrugli
1/06/2016 10:33 BST

Hi, where can I download the presentation?
Could you send me a link?

Ian Wrigley
27/05/2016 23:01 BST

My apologies; if you’re having issues extracting the archive on a Windows machine, please try the new version (the link is the same), which was uploaded on Friday at 10PM UK time. If the standard Windows unzip utility doesn’t extract it, use WinZip.

Michael Raths
27/05/2016 9:25 BST

It’s not possible to unzip the virtual machine (.vdi file).
It results in a CRC checksum error.