In this talk, Reynold will look back on Spark’s growth in adoption, use cases, and development. He will then look ahead to 2016, discussing both technical initiatives and the evolution of the Spark community.
2015 is the year of data science and platformization for Apache Spark. With new high-level APIs (e.g., DataFrames, machine learning pipelines, R) and extension points, Spark is accessible to a wider set of users and can plug in a myriad of data sources, algorithms, and external packages. 2015 also marks the beginning of Project Tungsten, a major revamp of Spark’s execution engine to improve its robustness and performance. In 2016, we will continue pushing on each of these fronts, making Spark even easier to use and more powerful.
Reynold Xin is a cofounder and chief architect at Databricks as well as an Apache Spark PMC member and release manager for Spark’s 2.0 release. Prior to Databricks, Reynold was pursuing a PhD at the UC Berkeley AMPLab, where he worked on large-scale data processing.
©2015, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.