Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

A practitioner’s guide to securing your Hadoop cluster

Michael Yoder (Cloudera), Benjamin Spivey (Cloudera), Mark Donsky (Okera), Mubashir Kazia (Cloudera)
9:00am–12:30pm Tuesday, 09/27/2016
Location: 1 E 09
Level: Intermediate
Average rating: 4.22 (9 ratings)

Prerequisite knowledge

  • General knowledge of Hadoop and system admin procedures

Materials or downloads needed in advance

  • A laptop with Internet access and the ability to run an ssh client

What you'll learn

  • How to secure a Hadoop cluster for production operations

Description

    Many Hadoop clusters lack even the most basic security controls. This is due to several factors: some security features did not exist as recently as two years ago, and the complexity of Hadoop security has proved daunting to administrators.

    Michael Yoder, Ben Spivey, Mark Donsky, and Mubashir Kazia walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.

    For each security feature, Michael, Ben, Mark, and Mubashir cover the following topics:

    • Introduction: what the security feature is, what protection it provides, and best practices and recommendations
    • Planning: how to enable the feature in a phased manner with the fewest growing pains and least risk
    • Relevance: why it’s important (demonstrated by live attacks against a cluster without the target security feature)
    • Implementation: an overview of how the implementation is performed, where the moving parts are, and potential pitfalls

    Michael Yoder


    Mike Yoder is a software engineer at Cloudera who has worked on a variety of Hadoop security features and internal security initiatives. Most recently, he implemented log redaction and the encryption of sensitive configuration values in Cloudera Manager. Prior to Cloudera, he was a security architect at Vormetric.


    Benjamin Spivey


    Ben Spivey is a principal solutions architect at Cloudera providing consulting services for large financial-services customers. Ben specializes in Hadoop security and operations. He is the coauthor of Hadoop Security from O’Reilly Media (2015).


    Mark Donsky


    Mark Donsky is a director of product management at Okera, a software company that provides discovery, access control, and governance at scale for modern heterogeneous data environments. Previously, Mark led data management and governance solutions at Cloudera, and he’s held product management roles at companies such as Wily Technology, where he managed the flagship application performance management solution, and Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse gas emissions by millions of dollars annually. He holds a BS with honors in computer science from Western University in Ontario, Canada.


    Mubashir Kazia


    Mubashir Kazia is a principal solutions architect at Cloudera and an SME in Apache Hadoop security in Cloudera’s Professional Services practice, where he helps customers secure their Hadoop clusters and comply with internal security policies. He also helps new customers transition to the Hadoop platform and implement their first few use cases, and he trains and mentors peers in Hadoop and Hadoop security. Mubashir has worked with customers from all verticals, including banking, manufacturing, healthcare, telecom, retail, and gaming. Previously, he worked on developing solutions for leading investment banking firms.

    Comments on this page are now closed.


    Sophia DeMartini
    10/09/2016 7:24pm EDT

    The slides have been posted – they’re located at the top of this page. There is a button which says “Download PPTX”.

    If you’re unable to download them, please leave a comment on this page.

    Mark Donsky
    10/09/2016 6:54pm EDT

    I will follow up with O’Reilly tomorrow and see what’s going on. Sorry for the delay!

    10/09/2016 6:45pm EDT

    Mark Donsky,
    during the session you said you would post your deck.
    I don't see it yet.

    Landy Reyes
    10/03/2016 12:39pm EDT

    Congrats! Excellent session. When will the presentation be available?

    Yasha B
    10/03/2016 5:40am EDT

    When can we have the URL to the presentation posted here? Thanks much!

    Vinod Balasubramanian
    09/27/2016 11:18am EDT

    Thanks Mark.

    Mark Donsky
    09/27/2016 8:28am EDT

    We’ll post them here just as soon as the session finishes today.

    09/27/2016 7:59am EDT

    How can I get the decks from the session?