The current HDFS replication mechanism is expensive: the default triplication scheme incurs 200% overhead in storage space and other resources (e.g., NameNode memory usage). Erasure coding (EC) can greatly reduce the storage overhead without sacrificing data reliability. The HDFS-EC project (HDFS-7285) aims to build native EC support inside HDFS.
By treating EC as a “first-class citizen” instead of an external layer (as in the HDFS-RAID project), HDFS-EC brings several significant benefits. First, it enables flexible and fine-grained EC policies. For large files, EC can be applied to the existing contiguous block layout, preserving data locality and facilitating efficient conversion to and from replication. Small files can also enjoy the benefits of EC by using the striping layout introduced as part of HDFS-EC. The striping layout also lets the client work with multiple data nodes in parallel, greatly enhancing aggregate throughput.
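To make the striping layout concrete, here is a minimal sketch of how a logical file offset maps to a cell, data block, and stripe within a block group. It assumes an RS(6,3)-style schema with six data blocks per group and 1 MB cells; the actual cell size and schema in HDFS-EC are configurable, so treat these constants as illustrative.

```python
# Sketch: mapping a logical file offset to its cell, data block, and stripe
# under a striped layout. ASSUMPTION: 6 data blocks per group and 1 MB
# cells (an RS(6,3)-style schema); real HDFS-EC parameters are configurable.

CELL_SIZE = 1024 * 1024   # 1 MB cell, the striping granularity
DATA_BLOCKS = 6           # data blocks per block group

def locate(offset):
    cell_index = offset // CELL_SIZE
    block_index = cell_index % DATA_BLOCKS    # which data block holds the cell
    stripe_index = cell_index // DATA_BLOCKS  # which stripe (row) in the group
    return block_index, stripe_index

# Consecutive cells land on different data blocks, which is why a client
# can read or write up to DATA_BLOCKS streams in parallel.
print(locate(0))              # first cell -> block 0, stripe 0
print(locate(6 * CELL_SIZE))  # seventh cell wraps back to block 0, stripe 1
```

Because adjacent cells round-robin across the data blocks, even a small file spreads its data over several data nodes, which is what lets small files benefit from EC without waiting for full-block-sized writes.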
Preliminary analysis of several production clusters shows that HDFS-EC can reduce the storage overhead from 200% to 50% on average. To allow tuning the trade-off between storage overhead and data reliability, HDFS-EC supports configurable and pluggable erasure codec algorithms and schemas through a unified framework, where different native libraries can be employed to implement a concrete erasure coder. This is critical to alleviating the performance impact introduced by EC on both the client and the data node. Benchmark tests indicate that by using the Intel ISA-L library, we can eliminate the CPU bottleneck and achieve 1-3x higher performance than other implementations (and a >20x speedup over the original HDFS-RAID implementation).
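The overhead numbers above follow directly from the redundancy math. The short sketch below works through the 200% → 50% comparison, using a Reed-Solomon (6,3) schema as the EC example; overhead here means redundant bytes stored per byte of user data.

```python
# Sketch: storage overhead of replication vs. erasure coding.
# Overhead = redundant bytes / user bytes.

def replication_overhead(replicas):
    # Each user byte is stored `replicas` times; all but one copy is redundant.
    return replicas - 1

def ec_overhead(data_units, parity_units):
    # An RS(k, m) group stores m parity units per k data units.
    return parity_units / data_units

print(f"3x replication: {replication_overhead(3):.0%}")  # 200%
print(f"RS(6,3):        {ec_overhead(6, 3):.0%}")        # 50%
```

Both schemes tolerate the loss of any three storage locations per group, but RS(6,3) does so with a quarter of the redundant storage, which is the source of the savings cited above.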
Zhe Zhang is a senior manager of core big data infrastructure at LinkedIn, where he leads an engineering team providing big data services (Hadoop Distributed File System (HDFS), YARN, Spark, TensorFlow, and beyond) to power LinkedIn’s business intelligence and relevance applications. Zhe is an Apache Hadoop PMC member; he led the design and development of HDFS Erasure Coding (HDFS-EC).
Weihua Jiang is an engineering manager at Intel working on big data enablement. He has worked on big data since 2011 and was the release manager for Intel’s Hadoop distribution from 2011 to 2014. He currently focuses on big data enablement, including optimizing the software stack for better performance and making the ecosystem enterprise-ready.
©2015, O'Reilly Media, Inc.