Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Low Latency Streaming: Twitter Heron on Infiniband

Karthik Ramasamy (Streamlio), Supun Kamburugamuve (Indiana University)
5:25pm–6:05pm Wednesday, September 27, 2017
Stream processing and analytics
Location: 1E 07/08
Level: Beginner
Secondary topics: Financial services, Media, Streaming

Who is this presentation for?

Software Engineers, Engineering Management, CIOs, Technology leaders

Prerequisite knowledge

A basic understanding of streaming analytics is helpful but not required.

What you'll learn

Attendees will come away with an overview of Heron and an understanding of how it achieves low latency through various optimizations and a new transport.

Description

Today’s enterprises produce data not only in high volume but also at high velocity, and with velocity comes the need to process that data in real time. To meet this need, we developed and deployed Heron, the next-generation streaming engine at Twitter. Heron processes billions of events per day at Twitter and has been in production for nearly three years. It provides unparalleled performance at large scale and has successfully met Twitter’s strict performance requirements for a variety of streaming applications. Heron is an open source project with several major contributors from various institutions.

Twitter Heron has been ported to high-performance computing (HPC) clusters with advanced processors, memory, I/O systems, and high-performance interconnects. Compared to Ethernet, interconnects such as Infiniband, Omnipath, and Cray XC networks offer sub-millisecond latencies, large bandwidths, and advanced messaging capabilities at a comparable price. Large-scale distributed streaming applications can benefit from these low latencies and high bandwidths, especially in the financial and IoT industries.

In this talk, the speakers will explain how they integrated the Infiniband high-performance interconnect with Twitter Heron and optimized it for low-latency, high-throughput stream processing. Their experiments show latencies as low as 7 ms and throughputs of around 170M tuples/sec with minimal resources.

Karthik Ramasamy

Streamlio

Karthik Ramasamy is the cofounder of Streamlio, a company that focuses on building next-generation real-time processing engines. Before Streamlio, he was the engineering manager and technical lead for real-time analytics at Twitter, where he co-created Twitter Heron. He has two decades of experience working in parallel databases, big data infrastructure, and networking. He cofounded Locomatix, a company specializing in real-time stream processing on Hadoop and Cassandra using SQL, which was acquired by Twitter. Before Locomatix, he had a brief stint with Greenplum, where he worked on parallel query scheduling. Greenplum was eventually acquired by EMC for more than $300M. Prior to Greenplum, Karthik was at Juniper Networks, where he designed and delivered platforms, protocols, databases, and high-availability solutions for network routers that are widely deployed on the internet. Before joining Juniper, he worked extensively at the University of Wisconsin on parallel database systems, query processing, scale-out technologies, storage engines, and online analytical systems. Several of these research projects were later spun off as a company acquired by Teradata.

Karthik is the author of several publications and patents, as well as the book Network Routing: Algorithms, Protocols and Architectures. He has a Ph.D. in computer science from the University of Wisconsin, Madison, with a focus on big data and databases.

Supun Kamburugamuve

Indiana University

Supun Kamburugamuve is a computer science Ph.D. candidate at Indiana University, USA. His research covers big data applications and frameworks, with a particular focus on data streaming for real-time data analytics. He is an Apache Software Foundation member and has contributed to many open source projects, including Apache Web Services projects. For his Ph.D., Supun is focusing on large-scale machine learning algorithms, data streaming algorithms for robots in the cloud, and large-scale data visualizations. Recently he has been working on high-performance enhancements to big data systems using HPC interconnects such as Infiniband and Omnipath. Before joining Indiana University, Supun worked on middleware systems and was a key member of the team that developed an open source enterprise service bus that is widely used for enterprise integration.
