Presented by O'Reilly and Cloudera • Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

Native erasure coding support inside HDFS

Zhe Zhang (LinkedIn), Weihua Jiang (Intel)
2:55pm–3:35pm Wednesday, 09/30/2015
Hadoop Internals & Development
Location: 1 E16 / 1 E17
Level: Advanced
Average rating: 4.29 (7 ratings)
Slides: PDF

The current HDFS replication mechanism is expensive: the default 3x replication scheme incurs 200% overhead in storage space and other resources (e.g., NameNode memory usage). Erasure coding (EC) can greatly reduce this storage overhead without sacrificing data reliability. The HDFS-EC project (HDFS-7285) aims to build native EC support inside HDFS.
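To make the overhead numbers concrete, here is a minimal sketch of the arithmetic, assuming a Reed-Solomon (6,3) schema (the default schema in HDFS-EC): 3x replication stores two redundant copies per block, while RS(6,3) adds three parity cells for every six data cells.

```java
// Minimal sketch: storage overhead of 3x replication vs. an RS(6,3) schema.
public class StorageOverhead {
    public static void main(String[] args) {
        // 3x replication: each block is stored 3 times -> 2 redundant copies.
        double replication = (3 - 1) / 1.0;   // 2.0 -> 200% overhead
        // RS(6,3): 3 parity cells protect every 6 data cells.
        double rs63 = 3 / 6.0;                // 0.5 -> 50% overhead
        System.out.printf("3x replication overhead: %.0f%%%n", replication * 100);
        System.out.printf("RS(6,3) overhead:        %.0f%%%n", rs63 * 100);
    }
}
```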

By treating EC as a “first-class citizen” rather than an external layer (as in the HDFS-RAID project), HDFS-EC brings several significant benefits. First, it enables flexible and fine-grained EC policies. For large files, EC can be applied on the existing contiguous block layout, preserving data locality and facilitating efficient conversion to and from replication. Small files can also enjoy the benefits of EC through the striping layout introduced as part of HDFS-EC; striping additionally lets the client work with multiple DataNodes in parallel, greatly enhancing aggregate throughput.
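As a rough illustration of the striping layout, the sketch below maps a logical file offset to the DataNode holding it and the offset within that node's block, assuming an RS(6,3) schema with a 64 KB cell size (illustrative values only; the actual mapping in HDFS lives in StripedBlockUtil and also handles block groups, parity cells, and failure recovery).

```java
// Minimal sketch of round-robin striping: cells of the logical file are
// distributed across the 6 data blocks of a block group in turn.
public class StripeMapper {
    static final int DATA_UNITS = 6;          // data cells per stripe, RS(6,3)
    static final long CELL_SIZE = 64 * 1024;  // bytes per cell (assumed)

    /** Index (0..5) of the data block holding the given logical file offset. */
    static int dataNodeIndex(long fileOffset) {
        long cellIndex = fileOffset / CELL_SIZE;
        return (int) (cellIndex % DATA_UNITS);
    }

    /** Offset within that data block. */
    static long offsetInBlock(long fileOffset) {
        long cellIndex = fileOffset / CELL_SIZE;
        long stripeIndex = cellIndex / DATA_UNITS;
        return stripeIndex * CELL_SIZE + fileOffset % CELL_SIZE;
    }

    public static void main(String[] args) {
        long off = 1_000_000L; // ~976 KB into the file
        System.out.println("DataNode index:  " + dataNodeIndex(off));
        System.out.println("Offset in block: " + offsetInBlock(off));
    }
}
```

Because consecutive cells land on different DataNodes, a reader can fetch up to six cells in parallel, which is where the aggregate-throughput gain comes from.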

Preliminary analysis of several production clusters shows that HDFS-EC can reduce the storage overhead from 200% to 50% on average. To allow tuning the trade-off between storage overhead and data reliability, HDFS-EC supports configurable and pluggable erasure codec algorithms and schemas through a unified framework, where different native libraries can be employed to implement a concrete erasure coder. This is critical to mitigating the performance impact EC imposes on both the client and the DataNode. Benchmark tests indicate that by using the Intel ISA-L library, we can eliminate the CPU bottleneck and achieve 1–3x higher performance compared with other implementations (and a >20x speedup over the original HDFS-RAID implementation).
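The sketch below illustrates the shape of such a pluggable coder framework using a trivial single-parity XOR coder; the names are hypothetical stand-ins, not Hadoop's actual classes, and a production Reed-Solomon coder would delegate the encoding math to a native library such as ISA-L.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical pluggable coder interface (illustrative, not Hadoop's API):
// a concrete coder computes parity units from equally sized data units.
interface RawEncoder {
    void encode(ByteBuffer[] dataUnits, ByteBuffer[] parityUnits);
}

// Simplest possible concrete coder: one parity unit that is the XOR of
// all data units. A real RS(6,3) coder would produce 3 parity units.
class XorEncoder implements RawEncoder {
    public void encode(ByteBuffer[] dataUnits, ByteBuffer[] parityUnits) {
        ByteBuffer parity = parityUnits[0];
        while (parity.hasRemaining()) {
            byte b = 0;
            for (ByteBuffer d : dataUnits) b ^= d.get();
            parity.put(b);
        }
    }

    public static void main(String[] args) {
        ByteBuffer[] data = {
            ByteBuffer.wrap(new byte[] {1, 2}),
            ByteBuffer.wrap(new byte[] {4, 8})
        };
        ByteBuffer[] parity = { ByteBuffer.allocate(2) };
        new XorEncoder().encode(data, parity);
        System.out.println(Arrays.toString(parity[0].array())); // [5, 10]
    }
}
```

With XOR, any single lost data unit can be rebuilt by XORing the parity with the surviving units; Reed-Solomon generalizes this idea to tolerate multiple simultaneous failures.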

Zhe Zhang

LinkedIn

Zhe Zhang is a senior manager of core big data infrastructure at LinkedIn, where he leads the engineering team that provides big data services (Hadoop Distributed File System (HDFS), YARN, Spark, TensorFlow, and beyond) powering LinkedIn’s business intelligence and relevance applications. Zhe is an Apache Hadoop PMC member; he led the design and development of HDFS Erasure Coding (HDFS-EC).

Weihua Jiang

Intel

Weihua Jiang is an engineering manager at Intel working on big data enablement. He has worked on big data since 2011 and was the release manager for Intel’s Hadoop distribution from 2011 to 2014. His current focus includes optimizing the software stack for better performance and making the ecosystem enterprise-ready.