Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Lessons from integrating Hadoop into an enterprise security architecture

Hellmar Becker (Hortonworks), Frank Albers (ING)
12:05–12:45 Friday, 3/06/2016
Location: Capital Suite 15/16 Level: Intermediate
Average rating: 4.43 (7 ratings)

Prerequisite knowledge

Attendees should have a basic understanding of software security concepts, the Kerberos protocol, LDAP, and how these components are integrated in Microsoft Active Directory.


How do you connect a Hadoop cluster to an enterprise directory with 100,000+ users and centralized role and access management?

ING has separate security architectures for Unix-based and Windows-based systems. User keys and group memberships can differ between the two worlds and are managed by different architecture and support groups within the organization. ING’s security architecture for Hadoop relies on Kerberos for authentication and Apache Ranger for authorization.

When the security architecture was created, the (Windows) Active Directory was used as the authentication point, which created a number of challenges. Setting up the connections and trust relationships required cooperation between the Windows and Linux teams. Solutions had to be found to map Windows identities to the Linux IDs that specific Hadoop components require. Apache Ranger uses LDAP queries for synchronization, but these queries scale poorly when the user base is large (100,000+ users). The team also had to figure out how to manage and control keytab files safely on a Kerberized Hadoop cluster, including on systems that are not managed by Ambari.
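The identity-mapping problem described above is commonly handled with Kerberos `auth_to_local` rules in Hadoop's `core-site.xml`, which translate Active Directory principals into Linux user names. The talk abstract does not specify ING's actual rules; the snippet below is a minimal illustrative sketch, and the realm `AD.EXAMPLE.COM` is a placeholder:

```xml
<!-- core-site.xml (sketch): map AD Kerberos principals to Linux user names.
     AD.EXAMPLE.COM is a hypothetical realm; substitute your own. -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@AD\.EXAMPLE\.COM)s/@.*//
    RULE:[2:$1@$0](.*@AD\.EXAMPLE\.COM)s/@.*//
    DEFAULT
  </value>
</property>
```

The first rule handles single-component principals (`user@REALM`), the second two-component service principals (`service/host@REALM`); both strip the realm suffix. A `/L` suffix on a rule additionally lowercases the result, which can help when AD account names are mixed-case. For the LDAP scaling issue, Ranger's usersync can be scoped with a user search base and filter so that only the relevant subset of a large directory is synchronized, rather than all 100,000+ accounts.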

Hellmar Becker and Frank Albers present ING’s approach to aligning Hadoop authentication and role management with ING’s policies and architecture, discuss challenges they met on the way, and outline the solutions they found.


Hellmar Becker


Hellmar Becker is a solutions engineer at Hortonworks, where he is helping spread the word about what you can do with data in the modern world. Hellmar has worked in a number of positions in big data analytics and digital analytics. Previously, he worked at ING Bank implementing the Datalake Foundation project (based on Hadoop) within client information management.


Frank Albers


Frank Albers is a software engineer on the big data DevOps team at ING. He specializes in Hortonworks/Hadoop, infrastructure solution patterns for cloud services, architecture, migration to the (private) cloud, security, and networking.

Comments on this page are now closed.


josep rafols
7/06/2016 12:24 BST

Is it possible to get the slides that you showed at the conference?

thanks in advance

Olaf Hein
6/06/2016 11:13 BST

Thanks for your really interesting talk. As a consultant, I’m currently working for two different German financial institutions on similar issues. I would like to get in contact with you to exchange experiences. If you are interested, please send me a message via the Attendee Directory.