Stories from the Trenches: The Challenges of Building an Analytics Stack
Organizations often showcase the virtues of their data platforms, but rarely share the challenges and decisions faced along the way. Our session describes how we architected our analytics stack around Druid, an open-source distributed data store, and how we overcame the challenges of scaling the system, balancing features with cost, and making performance consistent.
When we first designed Druid for problems of real-time data ingestion, fast queries, high availability, and multi-tenancy, there were numerous challenges we did not foresee. Over the last three years, we have evolved Druid and the entire analytics stack built around it to overcome a variety of these challenges. In the process, we have learned a great deal about understanding user query patterns, running nodes “in-memory”, managing failures, and properly monitoring systems.
Our Druid cluster now contains over a hundred terabytes of compressed data, representing trillions of rows, and we maintain a 95th percentile query latency of less than a second. We also ingest several terabytes of new data per hour. While scaling our cluster, we identified the properties that lead to consistent performance and systematically applied them to better manage memory and improve stability.
We hope this session will show anyone facing similar problems some potential ways of solving them.
Fangjin is one of the first developers at Metamarkets and on the Druid project. He mainly works on core infrastructure development. Fangjin came to Metamarkets from Cisco, where he developed diagnostic algorithms for various routers and switches. He holds a BASc in Electrical Engineering and an MASc in Computer Engineering from the University of Waterloo, Canada.
One of the first engineers on Metamarkets’ analytics team, Xavier is responsible for analytics infrastructure, including real-time analytics in Druid. He was previously a quantitative researcher at BlackRock. Prior to that, he held various research and analytics roles at Barclays Global Investors and MSCI. He holds an MEng in Operations Research from Cornell University and a Masters in Engineering from École Centrale Paris.