Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Kafka in jail. Running Kafka in container orchestrated clusters.

Sean Glover (Lightbend)
16:35–17:15 Wednesday, 23 May 2018

Who is this presentation for?

Developer Operations, Operations Engineer, Software Engineer, Data Engineer

Prerequisite knowledge

Basic understanding of Kafka.

What you'll learn

- Running and monitoring Kafka in containerized environments like Mesos and Kubernetes
- Scaling and moving brokers in mixed-use clusters
- Kafka configuration relevant to mixed-use clusters
- Persistence abstractions available in Mesos and Kubernetes


Kafka is best suited to run close to the metal on dedicated machines in statically defined clusters, but these fixed clusters are quickly becoming extinct. Companies want to create mixed-use clusters that take advantage of every resource available. Stateless transient services fit well into this model, but stateful services each have their own particular needs. Disk is one of Kafka’s most important resource requirements because it underpins message durability, but what is the best way to provision it while supporting service migration and dynamic resources?

We’ll explore the pros and cons of containerizing Kafka brokers relative to installing them directly on the host platform. We’ll contrast several popular orchestrated Kafka implementations, such as those on DC/OS (Mesos) and Kubernetes. You will discover how modern container orchestration tooling can be used to find the right balance between statically defined clusters and elasticity within a larger mixed-use cluster.

Kafka is an integral part of the Lightbend Fast Data Platform, the next-generation stream processing system. This talk shares our experience of how best to operate Kafka with container orchestration tools on public cloud services.

Sean Glover


Sean is a Software Engineer on the Fast Data Platform team at Lightbend, where he specializes in Apache Kafka and its ecosystem. The Fast Data Platform team is building the next-generation big data platform distribution with a focus on stream processors, machine learning, and operational ease of use. Sean has several years of experience consulting for Global 5000 companies, helping them build data streaming platforms using technologies such as Kafka, Spark, and Akka.
