Hadoop in the cloud is an increasingly common use case, as the cloud provides rapid access to flexible, low-cost IT resources. As with traditional on-premises Hadoop clusters, data authorization is crucial, and even more so in a multitenant cloud. A simple and smooth transition requires a transparent solution that decouples compute and storage. And since the underlying data is shared across components, a unified authorization policy should be enforced to accommodate the flexibility of the Hadoop ecosystem.
Li Li and Hao Hao explore Apache Sentry and RecordService as a solution to this problem. Apache Sentry is a framework that provides fine-grained authorization as a service, and RecordService is an abstraction layer between computing frameworks and data storage that can enforce Sentry's centralized authorization policies.
Li and Hao discuss the architecture of Apache Sentry and RecordService and how the fine-grained access control policies are uniformly enforced in different Hadoop components in the cloud, such as Hive, Solr, Impala, Kafka, Sqoop2, Spark, Pig, and MapReduce, with no performance loss. They also explain how Apache Sentry can leverage the benefits of both role-based access control (RBAC) and attribute-based access control (ABAC).
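As a rough illustration of the role-based side of this model, the sketch below shows how one central policy store might answer access checks for any component that asks. It assumes a simplified database/table/column hierarchy in which privileges are granted to roles and roles to groups. All class, method, and data names here are hypothetical; this is a toy model, not the Apache Sentry or RecordService API.

```python
from dataclasses import dataclass, field

# Toy model only: every name below is hypothetical and is not part of the
# Apache Sentry or RecordService APIs.


@dataclass(frozen=True)
class Privilege:
    action: str      # e.g. "select", "insert", or "all"
    resource: tuple  # path such as ("sales",) or ("sales", "orders")

    def allows(self, action, resource):
        action_ok = self.action in ("all", action)
        # A grant on a parent (e.g. a table) covers its children (columns),
        # but a grant on a child never covers the parent.
        covers = resource[:len(self.resource)] == self.resource
        return action_ok and covers


@dataclass
class PolicyStore:
    group_roles: dict = field(default_factory=dict)      # group -> set of roles
    role_privileges: dict = field(default_factory=dict)  # role -> list of Privilege

    def is_authorized(self, groups, action, resource):
        # Resolve groups -> roles -> privileges, and allow on the first match.
        for group in groups:
            for role in self.group_roles.get(group, ()):
                for privilege in self.role_privileges.get(role, ()):
                    if privilege.allows(action, resource):
                        return True
        return False


# One shared store: any engine that consults it sees the same decision.
store = PolicyStore(
    group_roles={"analysts": {"report_reader"}},
    role_privileges={
        "report_reader": [Privilege("select", ("sales", "orders"))],
    },
)
```

Because every caller consults the same store, a check such as `store.is_authorized(["analysts"], "select", ("sales", "orders", "total"))` gives the same answer no matter which engine issues it. That is the property a unified policy is meant to provide.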
Li Li is a software engineer on Google’s Cloud team. Previously, Li worked at Cloudera on the RecordService and Apache Sentry projects. She is also a committer and PMC member of the Apache Sentry (TLP) project. Li holds a master’s degree in computer science from Vanderbilt University.
Hao Hao is a software engineer at Cloudera currently working on the Apache Sentry project, a granular, role-based authorization module for Hadoop clusters. She is also a PMC member of the Apache Sentry (TLP) project. Hao performed extensive research on smartphone security and web security while she was a PhD student at Syracuse University. Prior to joining Cloudera, Hao worked on eBay’s Search Backend team, building search infrastructure for eBay’s online buying platform.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.