Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Architecting distributed systems for failure: How Druid guarantees data availability

Fangjin Yang (Imply)
2:40pm–3:20pm Thursday, 03/31/2016
Data Innovations

Location: LL21 E/F
Tags: real-time
Average rating: 3.25 (4 ratings)

Prerequisite knowledge

Attendees should have a basic understanding of distributed systems.


Running distributed systems in production can be tremendously challenging. Fangjin Yang covers common problems and failures with distributed systems and discusses design patterns that can be used to maintain data integrity and availability when everything goes wrong. Fangjin uses Druid as a real-world case study of how these patterns are implemented in an open source technology.

Attendees will learn firsthand about the software, hardware, network, and data-center problems that can arise when running distributed systems and the features that are required for availability and survivability. To provide real-world examples, Fangjin examines the architecture of Druid and demonstrates how the system is designed to power applications that need to be up 24/7. Fangjin also covers common pitfalls of running distributed systems in various environments, including the tradeoffs between on-premises and cloud deployments.

Fangjin outlines best practices for instrumenting distributed systems with monitoring and alerting, examines various open source technologies that can be used for efficient monitoring, and explains how these technologies can be used to maintain the availability of your cluster. By the end of this session, you’ll be able to better use, design, and monitor your data systems.

Fangjin Yang


Fangjin Yang is a coauthor of the open source Druid project and a cofounder of Imply, a data analytics startup based in San Francisco. Previously, Fangjin held senior engineering positions at Metamarkets and Cisco Systems. Fangjin has a BASc in electrical engineering and an MASc in computer engineering from the University of Waterloo, Canada.