Build Systems that Drive Business
June 11–12, 2018: Training
June 12–14, 2018: Tutorials & Conference
San Jose, CA

Metrics are not enough: Monitoring Apache Kafka

Gwen Shapira (Confluent), Xavier Léauté (Confluent)
11:25am–12:05pm Wednesday, June 13, 2018
Monitoring, Observability, and Performance
Location: LL21 A/B Level: Beginner
Secondary topics: Systems Monitoring & Orchestration
Average rating: 4.14 (7 ratings)

Prerequisite knowledge

  • A working knowledge of Apache Kafka

What you'll learn

  • Explore best practices for monitoring Apache Kafka


When you’re running systems in production, clearly you want to make sure they are up and running at all times. But in a distributed system such as Apache Kafka, what does “up and running” even mean? Experienced Apache Kafka users know what is important to monitor, which alerts are critical, and how to respond to them. They don’t just collect metrics; they go the extra mile and use additional tools to validate availability and performance on both the Kafka cluster and their entire data pipelines.

Gwen Shapira and Xavier Léauté share best practices for monitoring Apache Kafka, discussing critical metrics, “worst practices” (common mistakes that you should avoid), what metrics don’t tell you, and how to cover these essential gaps. You’ll discover which metrics are critical to alert on, which are useful in troubleshooting, and which may actually be misleading.

Gwen Shapira


Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Xavier Léauté


Xavier Léauté is a software engineer at Confluent, where he is responsible for analytics infrastructure, including real-time analytics in Kafka Streams. Previously, Xavier was a quantitative researcher at BlackRock and served in various research and analytics roles at Barclays Global Investors and MSCI. He holds an MEng in operations research from Cornell University and a master’s degree in engineering from École Centrale Paris.



Picture of Eric Bach
06/18/2018 5:11am PDT

I found your presentation very useful, as we are looking to venture into usage of Kafka and wanted to go in with our eyes wide open.
When do you expect your content to be available for download?