At Netflix, the big data platform is the foundation for the analytics that drive all product decisions directly affecting our customer experience. In terms of compute power and data size, it is one of the three largest services running at Netflix.
In this talk, we will walk through how we scale the platform to handle the growing volume of data (over 400 billion events generated daily), the growing demand for analytics (which translates into compute power), and the growing number of users who depend on our platform to make business decisions.
Specifically, we will talk about how we built this architecture, which architectural choices we made along the way, and the challenges we faced.
Overall, you will learn about our open-source-powered big data architecture in the AWS cloud and how we built out the technology stack that makes up the big data platform at Netflix today.
Eva Tse leads the Big Data Platform team at Netflix. Her team architects and manages the Netflix big data platform in the AWS cloud. The platform is leveraged across Netflix for data analytics and ETL. The technology stack includes various open source projects (e.g., Pig, Hive, Presto, Parquet, Hadoop) and Netflix open-sourced tools and services (e.g., Genie, Lipstick, Inviso). Prior to joining Netflix, Eva led the server and metadata service teams for PowerCenter at Informatica. Eva holds an MS and BS in computer science from the University of Houston.
©2015, O'Reilly Media, Inc.