The Netflix Data Platform is a constantly evolving, large-scale infrastructure running in the AWS cloud. This talk will dive into what we’re up to and why. We are especially focused on performance and ease of use. We’ve upgraded to Hadoop 2, partnered with the community developing Pig on Tez, adopted the Parquet file format, and fully integrated Presto into our stack. We are exploring Spark for streaming, machine learning, and analytic use cases.
We continue to add to our open source big data suite; our latest contribution is Inviso, which provides easy searching and visibility into Hadoop execution and performance. We are also heads-down developing a cohesive framework for easy platform interaction (via our big data API and big data portal). We’ll talk through these technologies and how they are benefiting the Netflix business. We’ll also dive into how we do things differently at Netflix (vs. most other companies), notably the motivations behind our architecture and approach and the benefits that we achieve (and hopefully you can, too).
Kurt Brown leads the data platform team at Netflix, which architects and manages the technical infrastructure underpinning the company’s analytics, including various big data technologies like Hadoop, Spark, and Presto, machine learning infrastructure for Netflix data scientists, and traditional BI tools including Tableau.
©2015, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.