Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Twitter Heron: Stream processing at scale

Karthik Ramasamy (Streamlio)
2:05pm–2:45pm Wednesday, 09/30/2015
IoT & Real-time
Location: 3D 02/11
Average rating: 3.93 (14 ratings)

Storm has long served as the main platform for real-time analytics at Twitter. However, as the scale of data processed in real time at Twitter has grown, along with the number and diversity of use cases, many limitations of Storm have become apparent. We needed a system that scaled better, was easier to debug, performed better, and was easier to manage, all while working in a shared cluster infrastructure.

We considered various alternatives to meet these needs and in the end concluded that we needed to build a new real-time stream data processing system. This talk will present the design and implementation of that system, called Heron, which is now the de facto stream data processing engine inside Twitter. We will also share our experiences running Heron in production.

Karthik Ramasamy

Streamlio

Karthik Ramasamy is the cofounder of Streamlio, a company building next-generation real-time processing engines. Karthik has more than two decades of experience working in parallel databases, big data infrastructure, and networking. Previously, he was engineering manager and technical lead for real-time analytics at Twitter, where he was the cocreator of Heron; cofounded Locomatix, a company that specialized in real-time stream processing on Hadoop and Cassandra using SQL (acquired by Twitter); briefly worked on parallel query scheduling at Greenplum (acquired by EMC for more than $300M); and designed and delivered platforms, protocols, databases, and high-availability solutions for network routers at Juniper Networks. He is the author of several patents, publications, and one best-selling book, Network Routing: Algorithms, Protocols, and Architectures. Karthik holds a PhD in computer science from UW Madison with a focus on databases, where he worked extensively in parallel database systems, query processing, scale-out technologies, storage engines, and online analytical systems. Several of these research projects were spun out as a company later acquired by Teradata.

Comments on this page are now closed.

Comments

peng wang
10/25/2015 9:02am EDT

Can you please share the slides? My email is pengwang5@yahoo.com

Karthik Ramasamy
10/20/2015 10:46am EDT

We used Twitter's version of Storm, forked in Nov 2013, for the performance comparison.

Bojan Joveski
10/20/2015 9:02am EDT

Can you please share the slides? My email is bjoveski@gmail.com

Sam Glover
10/16/2015 3:37am EDT

Hi Karthik,

Thanks for your presentation. Can you also send me the slides? Thanks!

samglo@att.net

Vinod Porwal
10/08/2015 6:47am EDT

Please share the slides. My email is vinod.porwal@hotmail.com

Gerardo Bodegas Martinez
10/01/2015 4:45am EDT

Sure, my email is g_bodegas@hotmail.com

Karthik Ramasamy
09/30/2015 7:15pm EDT

If you could share your email address, I can send the slides.

Saad Shams
09/30/2015 1:35pm EDT

Which Storm version was used in the performance comparison of Storm with Heron?

Gerardo Bodegas Martinez
09/30/2015 12:20pm EDT

slides?