Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

A practitioner’s guide to securing your Hadoop cluster

Mark Donsky (Okera), Andre Araujo (Cloudera), Michael Yoder (Cloudera), Manish Ahluwalia (Nerdwallet)
1:30pm–5:00pm Tuesday, March 14, 2017
Platform Security and Cybersecurity
Location: LL21 A
Level: Intermediate
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Hadoop admins and security ops

Prerequisite knowledge

  • General knowledge of Hadoop and system admin procedures

Materials or downloads needed in advance

  • A WiFi-enabled laptop with the ability to run an SSH client

What you'll learn

  • How to secure a Hadoop cluster for production operations


Many Hadoop clusters lack even the most basic security controls. There are two main reasons: some security features did not exist as recently as two years ago, and the complexity of Hadoop security has proved daunting to administrators.

Mark Donsky, André Araujo, Michael Yoder, and Manish Ahluwalia walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.

For each security feature, Mark, André, Michael, and Manish cover the following topics:

  • Introduction: What the security feature is, what protection it provides, and best practices and recommendations
  • Planning: How to enable the feature in a phased manner with the fewest growing pains and least risk
  • Relevance: Why it’s important (demonstrated by live attacks against a cluster without the target security feature)
  • Implementation: An overview of how the implementation is performed, where the moving parts are, and potential pitfalls

Mark Donsky


Mark Donsky is a director of product management at Okera, a software company that provides discovery, access control, and governance at scale for today’s heterogeneous data environments. Previously, Mark led data management and governance solutions at Cloudera, and he’s held product management roles at companies such as Wily Technology, where he managed the flagship application performance management solution, and Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse gas emissions by millions of dollars annually. He holds a BS with honors in computer science from Western University, Ontario, Canada.


Andre Araujo


André Araujo is a principal solutions architect at Cloudera. An experienced consultant with a deep understanding of the Hadoop stack and its components, he is a methodical and keen troubleshooter who loves making things run faster. André is skilled across the entire Hadoop ecosystem and specializes in building high-performance, secure, robust, and scalable architectures to fit customers’ needs.


Michael Yoder


Mike Yoder is a software engineer at Cloudera who has worked on a variety of Hadoop security features and internal security initiatives. Most recently, he implemented log redaction and the encryption of sensitive configuration values in Cloudera Manager. Prior to Cloudera, he was a security architect at Vormetric.


Manish Ahluwalia


Manish Ahluwalia is a security engineer at Nerdwallet. Manish has held software architect roles at Tibco Loglogic and Thales Vormetric and was a security engineer at Cloudera, where he focused on the security of the Hadoop ecosystem. Manish has worked in big data since its infancy at various companies in Silicon Valley. He is most passionate about security.