Running Hadoop in the cloud is increasingly common, since the cloud provides rapid access to flexible, inexpensive IT resources. As with traditional on-premises Hadoop clusters, data authorization is crucial in a multitenant cloud. A transparent solution that decouples compute from storage is also needed for a simple, smooth experience. Because the underlying data is shared across components, unified authorization policies must be enforced across all of them to produce a modern and flexible Hadoop ecosystem.
Hao Hao and Alex Leblang explore a solution to this problem that pairs Apache Sentry, a framework that provides fine-grained authorization as a service, with RecordService, an abstraction layer between computing frameworks and data storage that can use and enforce Sentry's centralized authorization policies. They discuss the architecture of Apache Sentry and RecordService and how fine-grained access control policies are uniformly enforced across Hadoop components in the cloud with no performance loss, looking specifically at Hive, Solr, Impala, Kafka, Sqoop2, Spark, Pig, and MapReduce. Along the way, Hao and Alex also explain how Apache Sentry can leverage the benefits of both role-based access control (RBAC) and attribute-based access control (ABAC).
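To make the idea of one centralized policy store combining RBAC and ABAC more concrete, here is a minimal sketch in Python. It is not Sentry's or RecordService's API: `PolicyStore`, `Privilege`, `is_allowed`, the resource strings, and the `pii` tag rule are all hypothetical names chosen for illustration. It assumes a group-to-role-to-privilege mapping for the RBAC part and a simple "tagged resources require a matching user attribute" rule for the ABAC part.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    """Hypothetical privilege: an action on a named resource."""
    resource: str  # e.g. "db=sales/table=orders" (illustrative format)
    action: str    # e.g. "select"


@dataclass
class PolicyStore:
    """Hypothetical central policy store consulted by every engine."""
    group_roles: dict = field(default_factory=dict)  # group -> set of role names
    role_privs: dict = field(default_factory=dict)   # role -> set of Privilege
    tag_rules: dict = field(default_factory=dict)    # resource tag -> required user attribute

    def is_allowed(self, groups, user_attrs, resource, action, resource_tags=()):
        # RBAC step: collect the roles reachable through the user's groups.
        roles = set()
        for group in groups:
            roles |= self.group_roles.get(group, set())
        wanted = Privilege(resource, action)
        if not any(wanted in self.role_privs.get(r, set()) for r in roles):
            return False
        # ABAC step: every tag with a rule must be matched by a user attribute.
        for tag in resource_tags:
            required = self.tag_rules.get(tag)
            if required is not None and required not in user_attrs:
                return False
        return True


# Illustrative policy: analysts may select from one table;
# anything tagged "pii" additionally requires the "pii_cleared" attribute.
store = PolicyStore()
store.group_roles["analysts"] = {"analyst_role"}
store.role_privs["analyst_role"] = {Privilege("db=sales/table=orders", "select")}
store.tag_rules["pii"] = "pii_cleared"

if __name__ == "__main__":
    print(store.is_allowed({"analysts"}, set(), "db=sales/table=orders", "select"))
```

Because every engine would ask the same store the same question, the decision does not depend on which component issues the request, which is the uniform-enforcement property the abstract describes.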
Hao Hao is a software engineer at Cloudera currently working on the Apache Sentry project, a granular, role-based authorization module for Hadoop clusters. She is also a PMC member of the Apache Sentry (TLP) project. Hao performed extensive research on smartphone security and web security while she was a PhD student at Syracuse University. Prior to joining Cloudera, Hao worked on eBay’s Search Backend team, building search infrastructure for eBay’s online buying platform.
Alex Leblang is an engineer at Cloudera on the RecordService team. Previously, Alex was an Apache Impala (incubating) engineer and interned at Vertica. He holds a bachelor’s degree from Brown University with concentrations in computer science and Latin American studies.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.