Big Data Pipeline and Analytics Platform Using NetflixOSS and Other Open Source Libraries

Sudhir Tonse (Netflix), Danny Yuan (Netflix Inc)
Computational Thinking
Portland 255

The amount of data that companies collect keeps growing rapidly, and companies increasingly use big data as a competitive cornerstone. Netflix has always believed in the power of data-driven analytics to improve its operational excellence and to drive key product features.

This session will walk attendees through the architecture of Netflix's data platform, including its data pipeline. Attendees will see how they can stand up a platform for large-scale data collection and analysis using open source software such as Suro and related projects under the NetflixOSS umbrella, as well as other open source software such as Apache Kafka, ElasticSearch, and Druid from Metamarkets.

The technical domains covered include log event generation and collection, messaging, real-time event processing, and OLAP. We will also walk attendees through use cases to show how to explore business and product metrics to gain operational insights.

Sudhir Tonse


Sudhir Tonse manages the Cloud Platform Infrastructure team at Netflix and is responsible for many of the services and components that form the Netflix Cloud Platform as a Service.

Many of these components have been open sourced under the NetflixOSS umbrella. Open source contributions include Archaius, a dynamic configuration and properties management library; Ribbon, an inter-process communication framework that includes cloud-friendly software load balancers; and Karyon, the nucleus of a PaaS service, among others.

Prior to Netflix, Sudhir was an architect at Netscape/AOL, delivering large-scale consumer and enterprise applications in the areas of personalization, infrastructure, and advertising solutions.

Sudhir is a weekend golfer and tries to make the most of the wonderful California weather and public courses.

Danny Yuan

Netflix Inc

Danny is an architect and software developer on Netflix’s Platform Engineering team. He works on Netflix’s distributed crypto service, data pipeline, and real-time analytics. He is the owner of Suro, Netflix’s open source data pipeline, and of Netflix’s predictive autoscaling engine.