Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Gwen Shapira and Todd Palino explain the right approach for getting the most out of Kafka, exploring how to monitor, optimize, and troubleshoot performance of your data pipelines from producer to consumer and from development to production.
Gwen and Todd outline some of the common problems that Kafka developers and administrators encounter when they take Apache Kafka from a proof of concept to production usage. Too often, these systems are overprovisioned and underutilized and still have trouble meeting reasonable performance agreements.
Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.
Todd Palino is a site reliability engineer at LinkedIn tasked with keeping Zookeeper, Kafka, and Samza deployments fed and watered. His days are spent, in part, developing monitoring systems and tools to make that job a breeze. Previously, Todd was a systems engineer at Verisign, where he developed service-management automation for DNS, networking, and hardware management and managed hardware and software standards across the company.
©2016, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.