In the past few years, Apache Kafka has established itself as the world’s most popular real-time, large-scale messaging system. Kafka has quickly become a mission-critical infrastructure component for modern data platforms and is used across a wide range of industries by thousands of companies, including Netflix, Cisco, PayPal, and Twitter.
The latest addition to the Apache Kafka project is Kafka Streams, a new stream processing library natively integrated with Kafka. It has a very low barrier to entry, is easy to operate, and offers a natural DSL for writing stream processing applications. As such, it is the most convenient yet scalable option for analyzing, transforming, or otherwise processing data that is backed by Kafka. Neha Narkhede offers an overview of Kafka Streams, covering its design and API, typical use cases, code examples, and its roadmap. Neha also compares Kafka Streams’ lightweight library approach with heavier, framework-based tools such as Spark Streaming or Storm, which require you to understand and operate a separate infrastructure for processing real-time data in Kafka.
Neha Narkhede is the cofounder and CTO at Confluent, a company backing the popular Apache Kafka messaging system. Previously, Neha led streams infrastructure at LinkedIn, where she was responsible for LinkedIn’s petabyte-scale streaming infrastructure built on top of Apache Kafka and Apache Samza. Neha specializes in building and scaling large distributed systems and is one of the initial authors of Apache Kafka. A distributed systems engineer by training, Neha works with data scientists, analysts, and business professionals to move the needle on results.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.