Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference
Singapore

Twitter's real-time stack: Processing billions of events with Heron and DistributedLog

Maosong Fu (Twitter)
12:05pm–12:45pm Wednesday, December 7, 2016
Average rating: 2.00 (3 ratings)

What you'll learn

  • Explore the end-to-end stack Twitter designed to analyze events in real time

Description

Twitter generates billions of events per day, and analyzing them in real time is a massive challenge. Maosong Fu offers an overview of the end-to-end real-time stack Twitter designed to meet this challenge, consisting of DistributedLog (a distributed, replicated messaging system) and Heron (a streaming system for real-time computation).

DistributedLog is a replicated log service built on top of Apache BookKeeper. It provides infinite, ordered, append-only streams for building robust real-time systems and is the foundation of Twitter’s publish-subscribe system. Heron is Twitter’s next-generation streaming system, built from the ground up to address its scalability and reliability needs. Both systems have been in production for nearly two years and are widely used at Twitter in a diverse range of applications, such as the search ingestion pipeline, ad analytics, and image classification.

Maosong describes Heron and DistributedLog in detail, covering use cases and sharing the experiences and challenges of operating real-time systems at scale.

Maosong Fu

Twitter

Maosong Fu is the technical lead for Heron and real-time analytics at Twitter and the author of a few publications in the distributed systems area. Maosong holds a master’s degree from Carnegie Mellon University and a bachelor’s degree from Huazhong University of Science and Technology.