Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Putting Kafka into overdrive

Todd Palino (LinkedIn), Gwen Shapira (Confluent)
5:10pm–5:50pm Wednesday, 03/30/2016
Data Innovations

Location: 210 C/G
Tags: real-time
Average rating: 4.62 (13 ratings)

Prerequisite knowledge

Attendees should have an understanding of how publish/subscribe messaging systems work, as well as basic knowledge of Apache Kafka. While a deep understanding of how Kafka works is not required, the more advanced the attendee is, the more immediately applicable the content will be.

Description

Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Learn the right approach for getting the most out of Kafka from the experts at LinkedIn and Confluent. Todd Palino and Gwen Shapira demonstrate how to monitor, optimize, and troubleshoot performance of your data pipelines—from producer to consumer, development to production—as they explore some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.

Topics include:

  • What latencies and throughputs you should expect from Kafka
  • How to select hardware and size components
  • What you should be monitoring
  • Design patterns and antipatterns for client applications
  • How to go about diagnosing performance bottlenecks
  • Which configurations to examine and which ones to avoid
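As a taste of the configuration discussion, here is a minimal sketch (not taken from the session material) of a Java producer with a few of the settings most commonly examined when tuning for throughput versus latency. The broker address and topic name are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TunedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; replace with your cluster.
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        // Settings often examined when tuning for throughput vs. latency:
        props.put("acks", "all");                // durability: wait for in-sync replicas
        props.put("compression.type", "snappy"); // trade CPU for network/disk savings
        props.put("batch.size", "65536");        // larger batches improve throughput
        props.put("linger.ms", "5");             // brief delay to let batches fill

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // "example-topic" is a placeholder topic name.
            producer.send(new ProducerRecord<>("example-topic", "key", "value"));
        }
    }
}
```

The right values depend entirely on your workload and hardware; the point of the sketch is simply which knobs tend to matter, not what to set them to.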

Todd Palino

LinkedIn

Todd Palino is a site reliability engineer at LinkedIn tasked with keeping Zookeeper, Kafka, and Samza deployments fed and watered. His days are spent, in part, developing monitoring systems and tools to make that job a breeze. Previously, Todd was a systems engineer at Verisign, where he developed service-management automation for DNS, networking, and hardware management and managed hardware and software standards across the company.


Gwen Shapira

Confluent

Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data-processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Comments on this page are now closed.

Comments

Srikanth Ch
04/07/2016 12:02am PDT

Hello, I have a question!
I didn’t follow this conference, since I didn’t know when it was happening.
I am new to Apache Kafka and have been learning and testing it for quite a while.
I want to explore more and am trying to work with ZooKeeper and Kafka.
Will this conference help me?

Todd Palino
03/31/2016 3:03am PDT

The slides have been uploaded, but I’m not sure how long they will take to show up. I have also posted them at Slideshare: http://www.slideshare.net/ToddPalino/putting-kafka-into-overdrive