Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Getting ready for GDPR and CCPA: Securing and governing hybrid, cloud, and on-premises big data deployments

Mark Donsky (Okera), Ifigeneia Derekli (Cloudera), Lars George (Okera), Michael Ernest (Dataiku)
9:00–12:30 Tuesday, 30 April 2019
Data Engineering and Architecture
Location: Capital Suite 10
Secondary topics:  Security and Privacy
Average rating: 4.00 (2 ratings)

Who is this presentation for?

  • Data architects, security developers, and security architects



Prerequisite knowledge

  • A general understanding of Hadoop concepts, security principles, and cloud concepts, such as S3, EMR, and transient clusters

What you'll learn

  • Learn best practices for securing and governing cloud, on-premises, and hybrid deployments, including wire encryption, encryption of data at rest, governance best practices, and unified, secure data catalogs for self-service discovery
  • Understand important aspects of GDPR and CCPA


Many big data environments lack even the most basic security and governance controls. This is due to several factors: some security features didn’t exist as recently as two years ago, and the complexity of Hadoop security has proved daunting to administrators. Nonetheless, with the emergence of regulations such as the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR), organizations can no longer afford to overlook the criticality of security and governance.

Mark Donsky, Ifigeneia Derekli, Lars George, and Michael Ernest walk you through securing a Hadoop cluster. You’ll start with a cluster with no security and then add security features related to authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.

For each security feature, they cover the following topics:

  • Introduction: What the security feature is, what protection it provides, and best practices and recommendations
  • Planning: How to enable the feature in a phased manner with the fewest growing pains and least risk
  • Relevance: Why it’s important (demonstrated by live attacks against a cluster without the target security feature) and how it relates to GDPR
  • Implementation: An overview of how the implementation is performed, where the moving parts are, and potential pitfalls

Mark Donsky


Mark Donsky is a director of product management at Okera, a software company that provides discovery, access control, and governance at scale for today’s modern heterogeneous data environments. Previously, Mark led data management and governance solutions at Cloudera, and he’s held product management roles at companies such as Wily Technology, where he managed the flagship application performance management solution, and Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse gas emissions by millions of dollars annually. He holds a BS with honors in computer science from Western University, Ontario, Canada.


Ifigeneia Derekli


Ifi Derekli is a senior solutions engineer at Cloudera, focusing on helping large enterprises solve big data problems using Hadoop technologies. Her subject-matter expertise is around security and governance, a crucial component of every successful production big data use case. Previously, Ifi was a presales technical consultant at Hewlett Packard Enterprise, where she provided technical expertise for Vertica and IDOL (currently part of Micro Focus). She holds a BS in electrical engineering and computer science from Yale University.


Lars George


Lars George is the principal solutions architect at Okera. Lars has been involved with Hadoop and HBase since 2007 and became a full HBase committer in 2009. Previously, Lars was the EMEA chief architect at Cloudera, acting as a liaison between the Cloudera professional services team and customers as well as partners in and around Europe and building the next data-driven solutions. He was also a cofounding partner of OpenCore, a Hadoop and emerging data technologies advisory firm. He has spoken at many Hadoop User Group meetings as well as at conferences such as ApacheCon, FOSDEM, QCon, Hadoop World, and Hadoop Summit. He also started the Munich OpenHUG meetings. He’s the author of HBase: The Definitive Guide from O’Reilly.


Michael Ernest


Michael Ernest is a partner solution architect at Dataiku, supporting technical integration with cloud platforms. He previously led field-enablement programming at Cloudera, where he developed training for new and tenured hires in Hadoop operations, application architecture, and full stack security. He’s published four books on Java programming and Sun Solaris administration. Michael lives in Berkeley, California.