Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Kafka in jail: Running Kafka in container orchestrated clusters

Sean Glover (Lightbend)
16:35–17:15 Wednesday, 23 May 2018

Who is this presentation for?

  • Operations engineers, software engineers, data engineers, and those in developer operations

Prerequisite knowledge

  • A basic understanding of Kafka

What you'll learn

  • Learn how to run and monitor Kafka in containerized environments like Mesos and Kubernetes, and how to scale and move brokers in mixed-use clusters
  • Understand Kafka configurations relevant to mixed-use clusters and persistence abstractions available in Mesos and Kubernetes

Description

Kafka is best suited to run close to the metal on dedicated machines in statically defined clusters, but these fixed clusters are quickly becoming extinct. Companies want to create mixed-use clusters that take advantage of every resource available. Stateless, transient services fit well into this model, but stateful services each have their own particular needs. Disk is one of Kafka’s most important resource requirements because it provides message durability, but what is the best way to provide disk resources to stateful technologies in a mixed-use cluster?

Sean Glover offers an overview of leading Kafka implementations on DC/OS and Kubernetes to explore how reliably they run Kafka in container-orchestrated clusters and detail the pros and cons of containerizing Kafka brokers relative to installing directly on the host platform. Static clusters require greater operational know-how to do common tasks with Kafka, such as applying broker configuration updates, upgrading to a new version, and adding or decommissioning brokers. By using Kafka implementations on DC/OS (Apache Mesos) and Kubernetes, you can reduce the overhead for a number of common operational tasks with standard cluster resource manager features. You’ll learn how to accommodate Kafka-specific operational logic in the form of a Kafka cluster helper application, known as a scheduler in Mesos and a controller in Kubernetes, and discover some of the pitfalls of such an approach, including how to manage broker storage effectively and the additional burden of monitoring scheduler- or controller-based Kafka cluster helper applications.

You’ll also learn how to use modern container orchestration tooling to find the right balance between statically defined clusters and elasticity within a larger mixed-use cluster. In mixed-use clusters, best practice often dictates that stateful applications are sticky to the host they’re running on because that state exists on local disk. However, there may be scenarios where using a distributed block storage solution is acceptable, which would allow brokers to have some mobility when there’s a need. Sean outlines the implications of using distributed block storage devices and the performance trade-offs in common failure or operational scenarios, such as when a broker needs to be replaced and topic partitions must be rebalanced.

Kafka is an integral part of the Lightbend Fast Data Platform, the next-generation stream processing system. Join in to see how to best implement operational Kafka with container orchestration tools on public cloud services.

Sean Glover

Lightbend

Sean Glover is a software engineer specializing in Apache Kafka and its ecosystem on the Fast Data Platform team at Lightbend, which is building a next-generation big data platform distribution with a focus on stream processors, machine learning, and operational ease of use. Sean has several years’ experience helping Global 5000 companies build data streaming platforms using technologies such as Kafka, Spark, and Akka.
