To manage the ever-increasing volume and velocity of data within your company, you may have successfully made the transition from single machines and one-off solutions to large, distributed stream infrastructures in your data center powered by Apache Kafka. But what’s to be done if one data center is not enough?
Ewen Cheslack-Postava explores resilient multi-data-center architecture with Apache Kafka, sharing best practices for data replication and mirroring as well as disaster scenarios and failure handling. Ewen covers four scenarios: replication and failover for disaster recovery, data produced in one location but consumed in another, an aggregate cluster for data analysis, and bidirectional replication. For each, he discusses the requirements, provides a proven architecture, and explains the benefits and limitations of the solution.
Ewen Cheslack-Postava is an engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. Ewen received his PhD from Stanford University, where he developed Sirikata, an open source system for massive virtual environments. His dissertation defined a novel type of spatial query giving significantly improved visual fidelity and described a system for efficiently processing these queries at scale.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.