Over the last few years, the streaming platform Apache Kafka has been used extensively for real-time data collection, delivery, and processing, particularly in the enterprise. Companies like LinkedIn now send more than a trillion messages per day to Kafka, and many companies (e.g., financial institutions) now store mission-critical data in it.
Jun Rao leads a deep dive into some of the key internals that help make Kafka popular and provide strong reliability guarantees. You’ll learn about the underlying design that gives Kafka its high throughput and how Kafka supports high reliability through its built-in replication mechanism. One common use case for Kafka is propagating updatable database records. Jun explains how a unique Kafka feature called compaction is designed to solve exactly this kind of problem more naturally.
Jun Rao is a cofounder of Confluent, a company that provides a streaming data platform on top of Apache Kafka. Previously, Jun was a senior staff engineer at LinkedIn, where he led the development of Kafka, and a researcher at IBM’s Almaden Research Center, where he conducted research on databases and distributed systems. Jun is the PMC chair of Apache Kafka and a committer on Apache Cassandra.
©2017, O'Reilly Media, Inc.