Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK

Transparent encryption in HDFS

Charles Lamb (Cloudera), Andrew Wang (Cloudera)
14:35–15:15 Thursday, 7/05/2015
Hadoop Platform
Location: King's Suite - Sandringham
Average rating: 4.50 (2 ratings)

Prerequisite Knowledge

Working knowledge of HDFS as a user, administrator, or programmer.

Description

Data encryption is a requirement in many business sectors that handle confidential information, such as finance, healthcare, and government. For example, HIPAA, FISMA, and PCI DSS all require that data be encrypted both in-flight (while being transferred over the network) and at-rest (when stored durably on disk). There can also be additional restrictions on the access, management, and storage of encryption keys.

To meet these requirements, transparent, end-to-end encryption was added to HDFS. Once configured, data read from and written to designated HDFS directories is transparently encrypted and decrypted without requiring any changes to user application code. This encryption is also end-to-end, meaning that data is protected both in-flight and at-rest, and can only be encrypted and decrypted by the client. This improves security because HDFS itself never handles unencrypted data or unencrypted data encryption keys. Furthermore, through the use of a new cluster service, the Hadoop Key Management Server (KMS), the responsibilities of key administration and HDFS administration can be separated, further enhancing security.
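As a rough illustration of the key separation described above, the following Java sketch shows the general idea of envelope encryption: a per-file data key is kept only in wrapped form on the storage side, and only the client, with help from a separate key service, ever sees the plaintext key or data. This is a simplified sketch, not the Hadoop implementation. All names in it (EnvelopeSketch, ToyKeyService, WrappedKey, StoredFile, clientWrite, clientRead) are hypothetical and are not Hadoop APIs, and the only real library it uses is the JDK's javax.crypto with AES in CTR mode.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class EnvelopeSketch {
    private static final SecureRandom RNG = new SecureRandom();

    static byte[] randomBytes(int n) {
        byte[] b = new byte[n];
        RNG.nextBytes(b);
        return b;
    }

    // AES in CTR mode; the same routine serves for encryption and decryption.
    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] input) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(input);
    }

    // A data key encrypted under the key service's master key, plus its IV.
    static final class WrappedKey {
        final byte[] iv;
        final byte[] bytes;
        WrappedKey(byte[] iv, byte[] bytes) { this.iv = iv; this.bytes = bytes; }
    }

    // Hypothetical stand-in for the key service: the only holder of the master key.
    static final class ToyKeyService {
        private final byte[] masterKey = randomBytes(16);

        // Creates a fresh data key and hands it out only in wrapped form.
        WrappedKey generateWrappedKey() throws Exception {
            byte[] iv = randomBytes(16);
            byte[] dataKey = randomBytes(16);
            return new WrappedKey(iv, aesCtr(Cipher.ENCRYPT_MODE, masterKey, iv, dataKey));
        }

        // Unwraps a data key for a client (a real service would check authorization).
        byte[] unwrap(WrappedKey w) throws Exception {
            return aesCtr(Cipher.DECRYPT_MODE, masterKey, w.iv, w.bytes);
        }
    }

    // Everything the storage side holds for one file: no plaintext, no plaintext key.
    static final class StoredFile {
        final WrappedKey wrappedKey;
        final byte[] dataIv;
        final byte[] ciphertext;
        StoredFile(WrappedKey wrappedKey, byte[] dataIv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.dataIv = dataIv;
            this.ciphertext = ciphertext;
        }
    }

    // Client-side write: obtain the data key from the key service, encrypt locally.
    static StoredFile clientWrite(ToyKeyService keys, byte[] plaintext) throws Exception {
        WrappedKey wrapped = keys.generateWrappedKey();
        byte[] dataKey = keys.unwrap(wrapped);
        byte[] dataIv = randomBytes(16);
        byte[] ciphertext = aesCtr(Cipher.ENCRYPT_MODE, dataKey, dataIv, plaintext);
        return new StoredFile(wrapped, dataIv, ciphertext);
    }

    // Client-side read: unwrap the stored key via the key service, decrypt locally.
    static byte[] clientRead(ToyKeyService keys, StoredFile file) throws Exception {
        byte[] dataKey = keys.unwrap(file.wrappedKey);
        return aesCtr(Cipher.DECRYPT_MODE, dataKey, file.dataIv, file.ciphertext);
    }

    public static void main(String[] args) throws Exception {
        ToyKeyService keys = new ToyKeyService();
        byte[] message = "confidential patient record".getBytes(StandardCharsets.UTF_8);
        StoredFile stored = clientWrite(keys, message);
        byte[] recovered = clientRead(keys, stored);
        System.out.println("round trip ok: " + Arrays.equals(message, recovered));
    }
}
```

In this sketch the StoredFile object plays the role of the storage layer, which holds only ciphertext and a wrapped key, while ToyKeyService plays the role of a separately administered key service. That split is what allows key administration and storage administration to be handled by different people.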

During this talk, we will cover the design, implementation, and usage of transparent encryption in HDFS. We will also cover performance results demonstrating the benefits of hardware crypto acceleration (AES-NI).

Charles Lamb

Cloudera

Software Engineer with 30+ years of experience developing DBMS software. S.M., S.B. Computer Science, MIT.

Andrew Wang

Cloudera

Andrew is a software engineer on the HDFS team at Cloudera. Previously, he was a graduate student in the AMPLab at the University of California, Berkeley, advised by Prof. Ion Stoica, where he worked on research related to in-memory caching and quality of service. In his spare time he enjoys going on bike rides, cooking, and playing guitar.

Comments

Yann Barraud
21/05/2015 16:05 BST

Hi,

Will the presentation be available somewhere?