Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

State of Spark, and where it is going

Reynold Xin (Databricks)
9:55am–10:05am Thursday, 12/03/2015
Location: Summit 1-2
Average rating: 3.94 (34 ratings)

In this talk, Reynold will review Spark’s growth in adoption, use cases, and development. He will then look ahead to the technical initiatives and the evolution of the Spark community planned for 2016.

2015 is the year of data science and platformization for Apache Spark. With new high-level APIs (e.g., DataFrames, machine learning pipelines, R) and extension points, Spark is accessible to a wider set of users and can plug in a myriad of data sources, algorithms, and external packages. 2015 also marks the beginning of Project Tungsten, a major revamp of Spark’s execution engine to improve its robustness and performance. In 2016, we will continue to push on both fronts, making Spark even easier to use and more powerful.

Reynold Xin


Reynold Xin is a cofounder and chief architect at Databricks, as well as an Apache Spark PMC member and the release manager for Spark 2.0. Prior to Databricks, Reynold was pursuing a PhD at the UC Berkeley AMPLab, where he worked on large-scale data processing.