Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Real-time monitoring of Twitter's network infrastructure with Heron

J Delange (Twitter), N Lu (Twitter)
3:50pm-4:30pm Thursday, March 28, 2019
Average rating: 2.67 (3 ratings)

Who is this presentation for?

  • Software engineers and managers

Level

Beginner

Prerequisite knowledge

  • A basic understanding of streaming systems and storage systems (useful but not required)

What you'll learn

  • Understand the challenges that engineers unfamiliar with data science face when building data pipelines
  • Explore the technologies Twitter used to build a large-scale data pipeline that ingests more than 1 billion tuples a day
  • Learn how to use data science to analyze and troubleshoot large-scale infrastructures

Description

Twitter users start and join multiple conversations every day, so monitoring and mitigating potential network issues in real time is key to a smooth user experience. Monitoring and analyzing a network at Twitter scale is extremely hard, and almost impossible to do manually. As a result, the company needed to create dedicated data pipelines to monitor network activity, detect potential issues, and generate network usage reports.

Julien Delange and Neng Lu explain how Twitter uses the Heron stream processing engine to monitor and analyze its network infrastructure. Within two months, engineers without intensive data science knowledge implemented a new data pipeline that ingests large-scale network events and reports networking issues and network equipment utilization. This pipeline ingests multiple data sources and processes about 1 billion tuples per day to detect network issues and generate usage statistics. The pipeline is now deployed in production and helps network engineers detect issues, troubleshoot network usage, and manage network capacity.

Join Julien and Neng for an overview of the Heron project. Along the way, you’ll learn the challenges Twitter faced when building it and the key technologies it used to overcome those challenges and achieve its scalability, low-maintenance, and low-latency goals.

J Delange

Twitter

Julien Delange is a staff software engineer at Twitter working on infrastructure services. Previously, he was a senior software engineer at Amazon Web Services, a senior member of the technical staff at Carnegie Mellon University, and a software engineer at the European Space Agency. Julien holds a PhD in computer science from Télécom ParisTech and a master’s degree in computer science from Université Pierre-et-Marie-Curie.

N Lu

Twitter

Neng Lu is a software engineer at Twitter, where he is a core member of Twitter’s real-time compute team and a core committer to the Apache Heron project (incubating). He has a broad interest in distributed systems and real-time analytics and has worked on Twitter’s key-value storage system, Manhattan; its monitoring system, Cuckoo; and its real-time processing system, Heron. He holds an MS in CS from UCLA and a bachelor’s degree in CS from Zhejiang University.