Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Apache Spark 2.4 and beyond

Xiao Li (Databricks), Wenchen Fan (Databricks)
2:40pm-3:20pm Thursday, March 28, 2019

Who is this presentation for?

  • Software engineers and tech leads

Level

Beginner

Prerequisite knowledge

  • A basic understanding of Apache Spark

What you'll learn

  • Understand the new features in the Apache Spark 2.4 release
  • Get insight into upcoming releases

Description

Apache Spark 2.4 brings many new features and improvements, including barrier execution mode, flexible streaming sinks, a native Avro data source, eager evaluation mode in PySpark, Kubernetes support, higher-order functions, Scala 2.12 support, and more.

Xiao Li and Wenchen Fan offer an overview of the major features and enhancements in Apache Spark 2.4. Along the way, you’ll learn about the design and implementation of V2 of the Data Source API and of catalog federation in the upcoming Spark release. Then you’ll get the chance to ask all your burning Spark questions.

Xiao Li

Databricks

Xiao Li is a software engineer at Databricks and an Apache Spark committer and PMC member. His main interests are Spark SQL, data replication, and data integration. Previously, he was an IBM master inventor and an expert on asynchronous database replication and consistency verification. He holds a PhD from the University of Florida.

Wenchen Fan

Databricks

Wenchen Fan is a software engineer at Databricks, where he works on Spark Core and Spark SQL. He is also an Apache Spark committer and PMC member. He focuses mainly on the Apache Spark open source community, leading the discussion and review of many features and fixes in Spark.