Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

One cluster does not fit all: Architecture patterns for multicluster Apache Kafka deployments

Gwen Shapira (Confluent)
2:05pm–2:45pm Thursday, September 28, 2017
Data engineering, Data Engineering & Architecture
Location: 1E 07/08
Level: Intermediate
Average rating: 3.33 (3 ratings)

Who is this presentation for?

  • Data architects, DevOps engineers, or anyone deploying Kafka and wondering how many clusters they really need

Prerequisite knowledge

  • Basic knowledge of Apache Kafka

What you'll learn

  • Explore Apache Kafka features for multitenant clusters
  • Learn how to run a single Kafka cluster in multiple data centers (and when this is a good idea)
  • Understand how to synchronize multiple clusters effectively for active-active, failover, and analytics use cases


In the last year, multicluster and cross-data center deployments of Apache Kafka have become the norm rather than the exception. The reasons are many and include:

  • Different groups in the same company using Kafka in different ways;
  • Collecting information from many geographical regions and branches to a centralized analytics cluster;
  • Planning for cases where an entire cluster or data center is not available;
  • Using Kafka to assist in cloud migration.

Gwen Shapira offers an overview of several use cases, including real-time analytics and payment processing, that may require multicluster solutions and discusses real-world examples with their specific requirements. Gwen outlines the pros and cons of several common architecture patterns, including multitenant Kafka clusters, active-active multiclusters, failover clusters, stretching a single cluster between multiple data centers, and using Kafka to bridge between clouds or between on-premises and the cloud. Along the way, she explores the features of Apache Kafka and demonstrates how to use this understanding of Kafka to choose the right architecture for use cases from the financial, retail, and media industries.

Gwen Shapira


Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data-processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.