Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Getting ready for GDPR: securing and governing hybrid, cloud and on-prem big data deployments

Mark Donsky (Cloudera), Andre Araujo (Cloudera), Syed Rafice (Cloudera), Mubashir Kazia (Cloudera)
9:00am–12:30pm Tuesday, March 6, 2018
Data engineering and architecture, Law, ethics, and governance
Location: LL20 C
Level: Intermediate

Who is this presentation for?

  • Those working in infosec, security admins, data stewards, and data curators

Prerequisite knowledge

  • A general understanding of Hadoop concepts, security principles, and cloud concepts (S3, EMR, transient clusters, etc.)

Materials or downloads needed in advance

  • A laptop

What you'll learn

  • Learn how to secure a Hadoop cluster
  • Understand best practices for securing and governing cloud, on-premises, and hybrid deployments
  • Explore important aspects of GDPR


Many Hadoop clusters lack even the most basic security and governance controls. This is due to several factors: some security features did not exist as recently as two years ago, and the complexity of Hadoop security has proved daunting to administrators. Nonetheless, with the emergence of regulations such as GDPR, organizations can no longer afford to overlook the criticality of security and governance.

Mark Donsky walks you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.

For each security feature, you’ll cover the following topics:

  • What the security feature is, what protection it provides, and best practices and recommendations
  • How to enable the feature in a phased manner with the fewest growing pains and least risk
  • Why it’s important (demonstrated by live attacks against a cluster without the target security feature) and how it relates to GDPR
  • How the implementation is performed, where the moving parts are, and potential pitfalls

Mark Donsky


Mark Donsky leads data management and governance solutions at Cloudera. Previously, Mark held product management roles at companies such as Wily Technology, where he managed the flagship application performance management solution, and Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse gas emissions. He holds a BS with honors in computer science from the University of Western Ontario.


Andre Araujo


André Araujo is a solutions architect with Cloudera. Previously, he was an Oracle database administrator. An experienced consultant with a deep understanding of the Hadoop stack and its components, André is skilled across the entire Hadoop ecosystem and specializes in building high-performance, secure, robust, and scalable architectures to fit customers’ needs. André is a methodical and keen troubleshooter who loves making things run faster.


Syed Rafice


Syed Rafice is a senior system engineer at Cloudera, where he specializes in big data on Hadoop technologies and is responsible for designing, building, developing, and assuring a number of enterprise-level big data platforms using the Cloudera distribution. Syed also focuses on both platform and cybersecurity. He has worked across multiple sectors, including government, telecoms, media, utilities, financial services, and transport.

Mubashir Kazia


Mubashir Kazia is a principal solutions architect at Cloudera and a subject-matter expert in Apache Hadoop security in Cloudera’s Professional Services practice. Mubashir helps customers secure their Hadoop clusters and comply with internal security policies. He also helps new customers transition to the Hadoop platform and implement their first few use cases. At Cloudera, Mubashir has worked with customers across verticals including banking, manufacturing, healthcare, telecom, retail, and gaming. Mubashir also trains and mentors peers in Hadoop and Hadoop security. Before joining Cloudera, Mubashir worked extensively on developing solutions for leading investment banking firms.
