Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

The IoT is driving the need for more secure big data analytics

16:35–17:15 Thursday, 25 May 2017
Level: Intermediate
Average rating: 4.00 (1 rating)

Who is this presentation for?

  • Those working in IT security and data analytics

Prerequisite knowledge

  • A basic understanding of the technical requirements of managing a Hadoop environment

What you'll learn

  • Learn how data encryption and tokenization can help you protect your Hadoop environment
  • Explore options for securing data and speeding Hadoop implementation


A global telecommunications company ingests 300 million customer records in under 1.5 minutes every day. A mid-size firm handles 3.7 billion transactions annually. A car manufacturer streams real-time sensor data from vehicles. Big data analytics lies at the heart of all these systems, driving transformation, innovation, and the identification of new threats, and these projects involve massive quantities of sensitive data. But with centralized big data platforms, cyberattackers can now focus on a single, known target, and with IoT-connected devices, physical risk is added to the risk of a data breach.

Data privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR), are harmonizing legislation across regions and having a global impact. But what really needs to happen is a move to data-centric security, where data is transformed by encryption that retains its value for analytics but not for attackers. Format- and order-preserving encryption can help protect enterprise data. The promise of homomorphic encryption is that nearly all computation can be done directly on encrypted data.
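
To make the idea of format-preserving encryption concrete, here is a minimal Python sketch. It is a toy Feistel construction over digit strings, not a standardized scheme and not the approach of any particular product. The function names, the key, and the round count are all assumptions made for illustration. It should not be used to protect real data.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_no: int, half: str, width: int) -> int:
    """Pseudorandom integer in [0, 10**width), derived from one half of the input."""
    message = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)


def _check(digits: str) -> None:
    if not (digits.isascii() and digits.isdigit() and len(digits) >= 2):
        raise ValueError("expected a string of at least two ASCII digits")


def toy_fpe_encrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Map a digit string to another digit string of the same length (toy only)."""
    _check(digits)
    mid = len(digits) // 2
    left, right = digits[:mid], digits[mid:]
    for r in range(rounds):
        width = len(left)
        # Feistel step: mix the left half with a keyed function of the right half.
        mixed = (int(left) + _round_value(key, r, right, width)) % (10 ** width)
        left, right = right, str(mixed).zfill(width)
    return left + right


def toy_fpe_decrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    """Invert toy_fpe_encrypt by running the rounds backwards."""
    _check(digits)
    mid = len(digits) // 2
    # The halves swap every round, so the split point depends on round parity.
    split = mid if rounds % 2 == 0 else len(digits) - mid
    left, right = digits[:split], digits[split:]
    for r in reversed(range(rounds)):
        width = len(right)
        original_left = (int(right) - _round_value(key, r, left, width)) % (10 ** width)
        left, right = str(original_left).zfill(width), left
    return left + right


if __name__ == "__main__":
    key = b"hypothetical-demo-key"  # assumption: a real key would come from a key manager
    protected = toy_fpe_encrypt(key, "4111111111111111")
    print(protected, toy_fpe_decrypt(key, protected))
```

Because the mapping is deterministic under a given key, equal inputs produce equal outputs. This is one sense in which protected data can keep its value for analytics: joins and group-by operations on the protected column still line up, and the column keeps its original length and character set.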

Data privacy as well as security needs to be at the forefront of strategy and architecture when commissioning any new enterprise application that processes sensitive data and when implementing new big data, IoT, mobile, and cloud initiatives. Privacy by design has emerged as an essential best practice for meeting security and privacy compliance mandates. It combines data-centric security, which neutralizes sensitive data in use, in motion, and at rest, with software development lifecycle security and automated application vulnerability detection during release and operation.

Modern data-centric security is the technology of choice, minimizing exposure of sensitive data and ensuring attackers get nothing of value when they do penetrate systems. Its essential benefit is that data-centric encryption renders data useless if it is lost or stolen. Brendan Rizzo explains how data encryption and tokenization can help you protect your Hadoop environment and outlines options for securing data and speeding Hadoop implementation, drawing on recent deployments in pharma, health insurance, retail, and telecoms to illustrate the impact on operations and other areas of the business.
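
As a rough illustration of how tokenization can leave an attacker with nothing of value, the Python sketch below swaps a sensitive value for a random token and keeps the mapping in a separate lookup table. The `ToyTokenVault` class and its methods are hypothetical names invented for this example, and the in-memory dictionaries stand in for whatever protected store a real deployment would use. The session itself may cover different techniques, such as vaultless tokenization.

```python
import secrets


class ToyTokenVault:
    """In-memory stand-in for a protected token store (illustration only)."""

    def __init__(self) -> None:
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random digit token of the same length as the value."""
        if not value:
            raise ValueError("value must not be empty")
        if value in self._token_by_value:
            # Reuse the existing token so repeated values stay consistent.
            return self._token_by_value[value]
        while True:
            token = "".join(secrets.choice("0123456789") for _ in value)
            # Retry on the rare collision or if the token equals the input.
            if token not in self._value_by_token and token != value:
                break
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._value_by_token[token]


if __name__ == "__main__":
    vault = ToyTokenVault()
    token = vault.tokenize("4111111111111111")
    print(token, vault.detokenize(token))
```

The token has no mathematical relationship to the original value, so a copy of the tokenized data set reveals nothing without access to the vault. The analytics cluster can hold only tokens, while detokenization is restricted to the few users or services that need the real values.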

With these strategies in place, the risk of compromise by insider attack, malware, or accident is vastly reduced, saving the enterprise and its employees from data breaches, potentially costly postbreach remediation, and damage to brand and reputation.

Topics include:

  • How to protect your ecosystem with different encryption methods, such as storage-level and field-level encryption
  • Which protection methods to use, how to use them, and why
  • Disk-level encryption: strengths and weaknesses
  • What’s the impact of encryption on your network performance?
  • Transparent data encryption (TDE)
  • Ways to balance data security with access for different users with different analytics needs
  • How time and money can be saved by containing the scope of compliance audits
  • Data-centric protection technologies that integrate with Hive, Sqoop, MapReduce, and other Hadoop interfaces

Brendan Rizzo