Sep 23–26, 2019

Improving Spark by taking advantage of disaggregated architecture

Chenzhao Guo (Intel), Carson Wang (Intel)
5:25pm–6:05pm Wednesday, September 25, 2019
Location: 1A 21/22
Average rating: 5.00 (2 ratings)

Who is this presentation for?

  • Data engineers

Level

Intermediate

Description

Shuffle in Apache Spark is the procedure that redistributes data across partitions. It is often costly because the shuffle data must be persisted on local disks, and it is the source of many scalability and reliability issues in Spark. Moreover, the assumption of collocated storage and compute does not always hold in today’s data centers, where the hardware trend is moving toward a disaggregated storage and compute architecture to improve cost efficiency and scalability.
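
For readers new to shuffle, here is a minimal Scala sketch of a shuffle-inducing Spark job: reduceByKey repartitions records by key, so each map task’s output is persisted to the executor’s local disks (under spark.local.dir) until the reduce tasks fetch it over the network.

```scala
import org.apache.spark.sql.SparkSession

object ShuffleExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-example")
      .master("local[*]") // local mode, just for demonstration
      .getOrCreate()

    val words = spark.sparkContext.parallelize(Seq("a", "b", "a", "c", "b"))

    // reduceByKey hash-partitions the (word, 1) pairs across reducers;
    // the map-side output is written as shuffle files on local disk and
    // served to the reduce tasks over the network.
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)

    spark.stop()
  }
}
```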

Chenzhao Guo and Carson Wang outline how a new Spark shuffle manager addresses these challenges and supports a disaggregated storage and compute architecture. The new design writes shuffle data to a remote cluster with pluggable storage backends, so the failure of a compute node no longer forces recomputation of the shuffle data. Spark executors can also be allocated and recycled dynamically, resulting in better resource use.
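
To make this concrete, here is a rough sketch of what implementing a pluggable shuffle manager entails. The ShuffleManager interface and its methods are those of Spark 2.4; the RemoteShuffleManager class and its package are illustrative placeholders, not the speakers’ actual implementation.

```scala
// Placed under org.apache.spark.shuffle so the private[spark] interface is visible.
package org.apache.spark.shuffle.remote

import org.apache.spark.{ShuffleDependency, SparkConf, TaskContext}
import org.apache.spark.shuffle._

// Hypothetical skeleton: Spark instantiates the class named by
// spark.shuffle.manager through this one-argument constructor.
class RemoteShuffleManager(conf: SparkConf) extends ShuffleManager {

  // Driver side: called when a shuffle dependency is created.
  override def registerShuffle[K, V, C](
      shuffleId: Int,
      numMaps: Int,
      dependency: ShuffleDependency[K, V, C]): ShuffleHandle = ???

  // Map tasks get a writer that would persist output to remote storage
  // instead of local disk.
  override def getWriter[K, V](
      handle: ShuffleHandle,
      mapId: Int,
      context: TaskContext): ShuffleWriter[K, V] = ???

  // Reduce tasks get a reader that fetches partitions from remote storage,
  // so a lost compute node does not force recomputation.
  override def getReader[K, C](
      handle: ShuffleHandle,
      startPartition: Int,
      endPartition: Int,
      context: TaskContext): ShuffleReader[K, C] = ???

  override def unregisterShuffle(shuffleId: Int): Boolean = true

  override def shuffleBlockResolver: ShuffleBlockResolver = ???

  override def stop(): Unit = ()
}
```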

For most people running Spark with collocated storage, upgrading the disks on every node to the latest hardware, such as NVMe SSDs and persistent memory, is usually impractical due to cost and system compatibility. The new shuffle manager instead enables building a separate cluster that stores and serves the shuffle data, leveraging the latest hardware to improve performance and reliability. This work also matters in the high-performance computing (HPC) world, where storage and compute are typically disaggregated and more people are starting to use Spark. You’ll leave with an overview of the challenges in the current Spark shuffle implementation and the design of the new shuffle manager. Chenzhao and Carson also present a performance study of the work.
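
Deployment might then look like the sketch below. spark.shuffle.manager and the dynamic-allocation key are standard Spark settings; the RemoteShuffleManager class and the storage-URI key are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("disaggregated-shuffle-demo")
  // Swap in a custom shuffle manager (illustrative class name).
  .config("spark.shuffle.manager",
    "org.apache.spark.shuffle.remote.RemoteShuffleManager")
  // Hypothetical key: where the separate shuffle cluster stores data.
  .config("spark.shuffle.remote.storage.uri",
    "hdfs://shuffle-cluster:9000/shuffle")
  // With shuffle data off the compute nodes, executors can be released
  // and re-acquired freely, improving resource use.
  .config("spark.dynamicAllocation.enabled", "true")
  .getOrCreate()
```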

Prerequisite knowledge

  • A basic understanding of Spark shuffle

What you'll learn

  • Understand the essence of Spark shuffle and disaggregated architecture

Chenzhao Guo

Intel

Chenzhao Guo is a big data software engineer at Intel. He’s currently a contributor to Apache Spark and a committer on OAP and HiBench. He graduated from Zhejiang University.


Carson Wang

Intel

Carson Wang is a big data software engineer at Intel, where he focuses on developing and improving new big data technologies. He’s an active open source contributor to the Apache Spark and Alluxio projects as well as a core developer and maintainer of HiBench, an open source big data microbenchmark suite. Previously, Carson worked for Microsoft on Windows Azure.

