Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Security and Privacy sessions

Recent regulations in Europe (GDPR) and California (Consumer Privacy Act) have placed concepts like “user control” and “privacy-by-design” at the forefront for companies wanting to deploy ML. The good news is that there are new privacy-preserving tools and techniques – including differential privacy – that are becoming available for both business intelligence and ML applications.

  • Data security and privacy: A recent white paper from the Hoover Institution observed that we are beginning to see the convergence of data privacy and security. This is an age when companies are guarding against the misuse of data, either by adversaries or by parties they presently trust but may no longer trust in the future: “Anyone, from a privacy perspective, can become an adversary, given enough time.”
  • The use of data, analytics, and machine learning in security and cybersecurity.
  • Secure and robust analytics, including secure machine learning and aspects of machine deception (such as machines deceiving machines, or people deceiving machines).

9:00–12:30 Tuesday, 30 April 2019
Data Engineering and Architecture
Location: Capital Suite 10
Mark Donsky (Okera), Ifigeneia Derekli (Cloudera), Lars George (Okera), Michael Ernest (Okera)
Average rating: ****.
(4.00, 2 ratings)
New regulations such as CCPA and GDPR are driving new compliance, governance, and security challenges for big data. Infosec and security groups must ensure a consistently secured and governed environment across multiple workloads. Mark Donsky, Ifigeneia Derekli, Lars George, and Michael Ernest share hands-on best practices for meeting these challenges, with special attention paid to CCPA.

11:15–11:55 Wednesday, 1 May 2019
Law and Ethics
Location: Capital Suite 4
Sundeep Reddy Mallu (Gramener)
Average rating: *****
(5.00, 4 ratings)
Answering the simple question of what rights Indian citizens have over their data is a nightmare. The rollout of India Stack technology-based solutions has added fuel to the fire. Sundeep Reddy Mallu explains, with on-the-ground examples, how businesses and citizens in India's booming digital economy are navigating the India Stack ecosystem while dealing with data privacy, security, and ethics.

11:15–11:55 Wednesday, 1 May 2019
Felipe Hoffa (Google)
Average rating: ***..
(3.50, 4 ratings)
Before releasing a public dataset, practitioners need to thread the needle between utility and protection of individuals. Felipe Hoffa explores how to handle massive public datasets, taking you from theory to real life as he showcases newly available tools that help with PII detection and bring concepts like k-anonymity and l-diversity to the practical realm.

11:15–11:55 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Average rating: ****.
(4.75, 4 ratings)
The application of AI algorithms in domains such as criminal justice, credit scoring, and hiring holds unlimited promise. At the same time, it raises legitimate concerns about algorithmic fairness. There's a growing demand for fairness, accountability, and transparency from machine learning (ML) systems. Nick Pentreath explains how to build just such a pipeline leveraging open source tools.

14:05–14:45 Wednesday, 1 May 2019
Mark Donsky (Okera), Nikki Rouda (Amazon Web Services)
Average rating: ****.
(4.67, 3 ratings)
The implications of new privacy regulations for data management and analytics, such as the General Data Protection Regulation (GDPR) and the upcoming California Consumer Privacy Act (CCPA), can seem complex. Mark Donsky and Nikki Rouda highlight aspects of the rules and outline the approaches that will assist with compliance.

14:05–14:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI, Expo Hall
Location: Expo Hall (Capital Hall N24)
Mikio Braun (Zalando SE)
Average rating: *****
(5.00, 3 ratings)
Mikio Braun explores techniques and concepts around fairness, privacy, and security when it comes to machine learning models.

14:55–15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Mark Grover (Lyft), Deepak Tiwari (Lyft)
Average rating: ****.
(4.69, 13 ratings)
Lyft’s data platform is at the heart of the company's business. Decisions from pricing to ETA to business operations rely on Lyft’s data platform. Moreover, it powers the enormous scale and speed at which Lyft operates. Mark Grover and Deepak Tiwari walk you through the choices Lyft made in the development and sustenance of the data platform, along with what lies ahead in the future.

14:55–15:35 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Alexander Adam (Faculty)
Average rating: ****.
(4.00, 1 rating)
The advent of "fake news" has led us to doubt the truth of online media, and advances in machine learning give us an even greater reason to question what we are seeing. Despite the many beneficial applications of this technology, it's also potentially very dangerous. Alex Adam explains how synthetic videos are created and how they can be detected.

16:35–17:15 Wednesday, 1 May 2019
Case studies, Strata Business Summit
Location: Capital Suite 12
Maurício Lins (everis consultancy UK), Lidia Crespo (Santander UK)
Average rating: ****.
(4.50, 4 ratings)
Big data is usually regarded as a menace to data privacy. But with data privacy principles and a customer-first mindset, it can be a game changer. Maurício Lins and Lidia Crespo explain how Santander UK applied this model to comply with GDPR, using graph technology, Hadoop, Spark, and Kudu to drive data obscuring, data portability, and machine learning exploration.

16:35–17:15 Wednesday, 1 May 2019
Data Science, Machine Learning & AI, Expo Hall
Location: Expo Hall (Capital Hall N24)
Maren Eckhoff (QuantumBlack)
Average rating: ****.
(4.50, 4 ratings)
The success of machine learning algorithms in a wide range of domains has led to a desire to leverage their power in ever more areas. Maren Eckhoff discusses modern explainability techniques that increase the transparency of black box algorithms, drive adoption, and help manage ethical, legal, and business risks. Many of these methods can be applied to any model without limiting performance.

17:25–18:05 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Chris Wallace (Cloudera)
Average rating: *****
(5.00, 4 ratings)
Imagine building a model whose training data is collected on edge devices such as cell phones or sensors. Each device collects data unlike any other, and the data cannot leave the device because of privacy concerns or unreliable network access. This challenging situation is known as federated learning. Chris Wallace discusses the algorithmic solutions and the product opportunities.

10:15–10:35 Thursday, 2 May 2019
Location: Auditorium
Sandra Wachter (University of Oxford)
Average rating: ****.
(4.65, 20 ratings)
Big data analytics and AI draw nonintuitive and unverifiable inferences about the behaviors, preferences, and lives of individuals. These inferences draw on diverse and feature-rich data of unpredictable value and create new opportunities for discriminatory, biased, and invasive decision making. Sandra Wachter discusses how this expands potential victims of discrimination and potential harm.

11:15–11:55 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Scott Stevenson (Faculty)
Average rating: *****
(5.00, 4 ratings)
Modern deep learning systems allow us to build speech synthesis systems with the naturalness of a human speaker. While there are myriad benevolent applications, this also ushers in a new era of fake news. Scott Stevenson explores the danger of such systems and details how deep learning can also be used to build countermeasures to protect against political disinformation.

11:15–11:55 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
David Dogon (Van Lanschot Kempen)
Average rating: ****.
(4.75, 8 ratings)
David Dogon dives into a best practice use case for detecting fraud at a financial institution and details a dynamic and robust monitoring system that successfully detects unwanted client behavior. Join in to learn how machine learning models can provide a solution in cases where traditional systems fall short.

11:15–11:55 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 10/11
Eoin O'Flanagan (NewDay), Darragh McConville (Kainos)
Average rating: ****.
(4.86, 7 ratings)
Eoin O'Flanagan and Darragh McConville explain how NewDay built a high-performance contemporary data processing platform from the ground up on AWS. Join in to explore the company's journey from a traditional legacy onsite data estate to an entirely cloud-based PCI DSS-compliant platform.

12:05–12:45 Thursday, 2 May 2019
Alasdair Allan (Babilim Light Industries)
Average rating: *****
(5.00, 4 ratings)
Alasdair Allan explains why the current age, where privacy is no longer "a social norm," may not long survive the coming of the internet of things, as new smart embedded hardware may cause the demise of large-scale data harvesting. Smart devices will process data at the edge, allowing us to extract insights from the data without storing potentially privacy- and GDPR-infringing data.

12:05–12:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Václav Surovec (Deutsche Telekom), Gabor Kotalik (Deutsche Telekom)
Average rating: ****.
(4.00, 2 ratings)
Knowledge of customers' location and travel patterns is important for many companies, including German telco service operator Deutsche Telekom. Václav Surovec and Gabor Kotalik explain how a commercial roaming project using Cloudera Hadoop helped the company better analyze the behavior of its customers from 10 countries and provide better predictions and visualizations for management.

14:05–14:45 Thursday, 2 May 2019
Marcel Ruiz Forns (Wikimedia Foundation)
Average rating: ****.
(4.75, 4 ratings)
Analysts and researchers studying Wikipedia are hungry for long-term data to build experiments and feed data-driven decisions. But Wikipedia has a strict privacy policy that prevents storing privacy-sensitive data over 90 days. Marcel Ruiz Forns explains how the Wikimedia Foundation's analytics team is working on a vegan data diet to satisfy both.

14:05–14:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 10/11
Tom Walwyn (Cloudflare)
Average rating: ****.
(4.00, 1 rating)
Cloudflare powers nearly 10 percent of all Internet requests worldwide, absorbing some of the largest DDoS attacks. Learn how we use ClickHouse and SQL to simplify our data pipelines on a global scale while experiencing over 10 million events per second.

16:35–17:15 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Brennan Lodge (Goldman Sachs), Jay Kesavan (Bowery Analytics LLC)
Average rating: ***..
(3.00, 3 ratings)
Cybersecurity analysts are under siege to keep pace with the ever-changing threat landscape. The analysts are overworked as they are bombarded with and burned out by the sheer number of alerts that they must carefully investigate. Brennan Lodge and Jay Kesavan explain how to use a data science model for alert evaluations to empower your cybersecurity analysts.