Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Kafka in jail: Running Kafka in container-orchestrated clusters

Sean Glover (Lightbend)
16:35–17:15 Wednesday, 23 May 2018

Who is this presentation for?

  • Operations engineers, software engineers, data engineers, and those in developer operations

Prerequisite knowledge

  • A basic understanding of Kafka

What you'll learn

  • Learn how to run and monitor Kafka in containerized environments like Mesos and Kubernetes, and how to scale and move brokers in mixed-use clusters
  • Understand Kafka configurations relevant to mixed-use clusters and persistence abstractions available in Mesos and Kubernetes


Kafka is best suited to run close to the metal on dedicated machines in statically defined clusters, but these fixed clusters are quickly becoming extinct. Companies want to create mixed-use clusters that take advantage of every resource available. Stateless, transient services fit well into this model, but stateful services each have their own particular needs. Disk is one of Kafka’s most important resource requirements because it provides message durability, but what is the best way to provide disk resources to stateful technologies in a mixed-use cluster?

Sean Glover offers an overview of leading Kafka implementations on DC/OS and Kubernetes to explore how reliably they run Kafka in container-orchestrated clusters and details the pros and cons of containerizing Kafka brokers relative to installing directly on the host platform. Static clusters require greater operational know-how to do common tasks with Kafka, such as applying broker configuration updates, upgrading to a new version, and adding or decommissioning brokers. By using Kafka implementations on DC/OS (Apache Mesos) and Kubernetes, you can reduce the overhead of a number of common operational tasks with standard cluster resource manager features. You’ll learn how to accommodate Kafka-specific operational logic in the form of a Kafka cluster helper application, known as a scheduler in Mesos and a controller in Kubernetes, and discover some of the pitfalls of such an approach, including how to manage broker storage effectively and the additional burden of monitoring scheduler- or controller-based Kafka cluster helper applications.

You’ll also learn how to use modern container orchestration tooling to find the right balance between statically defined clusters and elasticity within a larger mixed-use cluster. In mixed-use clusters, best practice often dictates that stateful applications are sticky to the host they’re running on because that state exists on the local disk. However, there may be scenarios where a distributed block storage solution is acceptable, which would give brokers some mobility when there’s a need. Sean outlines the implications of using distributed block storage devices and the performance trade-offs in common failure or operational scenarios, such as when a broker needs to be replaced and topic partitions must be rebalanced.

Kafka is an integral part of the Lightbend Fast Data Platform, the next-generation stream processing system. Join in to see how to best implement operational Kafka with container orchestration tools on public cloud services.

Sean Glover


Sean Glover is a software engineer specializing in Apache Kafka and its ecosystem on the Fast Data Platform team at Lightbend, which is building a next-generation big data platform distribution with a focus on stream processors, machine learning, and operational ease of use. Sean has several years’ experience helping Global 5000 companies build data streaming platforms using technologies such as Kafka, Spark, and Akka.


Sean Glover
30/05/2018 15:40 BST

Thanks for attending! Slides are available here: