The last year has seen significant growth in the Spark community, with several major releases (Spark 1.0, 1.1, and 1.2), new standard libraries (Spark MLlib and Spark SQL), and an ecosystem of community projects based on Spark.
This talk will provide an overview of Apache Spark and its current feature set, adoption, and use cases. It will then cover recent feature additions to Apache Spark, such as elastic scaling support, new algorithms in MLlib, and the Spark SQL data sources API. It will also outline the Spark roadmap for the coming months. Since this talk is not until May, the specific roadmap details will likely be determined close to the talk itself.
This talk is being submitted by Patrick Wendell, release manager of Spark 1.0, 1.1, and 1.2.
Patrick Wendell is a cofounder of Databricks and a committer and PMC member of Apache Spark. He is the release manager of the Spark 1.0, 1.1, and 1.2 releases. Before helping start Databricks, Patrick was a Ph.D. student in the U.C. Berkeley AMPLab, where he focused on large-scale data-intensive computing and was advised by Ion Stoica.
©2015, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.