Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Security and Privacy sessions

Recent regulations in Europe (GDPR) and California (Consumer Privacy Act) have placed concepts like “user control” and “privacy-by-design” at the forefront for companies wanting to deploy ML. The good news is that there are new privacy-preserving tools and techniques – including differential privacy – that are becoming available for both business intelligence and ML applications.

  • Data security and privacy: A recent white paper from the Hoover Institution observed that we are beginning to see the convergence of data privacy and security. This is an age when companies are guarding against the misuse of data, either by adversaries or by parties they presently trust but may no longer trust in the future: “Anyone, from a privacy perspective, can become an adversary, given enough time.”
  • The use of data, analytics, and machine learning in security and cybersecurity.
  • Secure and robust analytics, including secure machine learning and aspects of machine deception (such as machines deceiving machines, or people deceiving machines).
9:00–12:30 Tuesday, 30 April 2019
Data Engineering and Architecture
Location: Capital Suite 10
Mark Donsky (Okera), Ifigeneia Derekli (Cloudera), Lars George (Okera), Michael Ernest (Okera)
New regulations such as CCPA and GDPR are driving new compliance, governance, and security challenges for big data. Infosec and security groups must ensure a consistently secured and governed environment across multiple workloads. Mark Donsky, Ifigeneia Derekli, Lars George, and Michael Ernest share hands-on best practices for meeting these challenges, with special attention paid to CCPA.
11:15–11:55 Wednesday, 1 May 2019
Law and Ethics, Strata Business Summit
Location: Capital Suite 4
Sundeep Reddy Mallu (Gramener Inc)
Answering the simple question of what rights Indian citizens have over their data is a nightmare. The rollout of India Stack technology-based solutions has added fuel to the fire. Sundeep Reddy Mallu explains, with on-the-ground examples, how businesses and citizens are navigating the India Stack ecosystem while dealing with data privacy, security, and ethics in India's booming digital economy.
11:15–11:55 Wednesday, 1 May 2019
Felipe Hoffa (Google)
Before releasing a public dataset, practitioners need to thread the needle between utility and protection of individuals. We will explore massive public datasets, taking you from theory to real life and showcasing newly available tools that help with PII detection and bring concepts like k-anonymity and l-diversity to the practical realm (with options such as removing, masking, and coarsening).
11:15–11:55 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
The application of AI algorithms in domains such as criminal justice, credit scoring, and hiring holds unlimited promise. At the same time, it raises legitimate concerns about algorithmic fairness. There is a growing demand for fairness, accountability, and transparency from machine learning (ML) systems. In this talk we cover how to build an ML pipeline that meets this demand using open source tools.
14:05–14:45 Wednesday, 1 May 2019
Mark Donsky (Okera), Nikki Rouda (Amazon Web Services)
The implications of new privacy regulations for data management and analytics, such as the General Data Protection Regulation (GDPR) and the upcoming California Consumer Privacy Act (CCPA), can seem complex. Mark Donsky and Nikki Rouda highlight aspects of the rules and outline the approaches that will assist with compliance.
14:05–14:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI, Expo Hall
Location: Expo Hall (Capital Hall N24)
Mikio Braun (Zalando SE)
Mikio Braun explores techniques and concepts around fairness, privacy, and security when it comes to machine learning models.
14:55–15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Mark Grover (Lyft), Deepak Tiwari (Lyft)
Lyft’s data platform is at the heart of Lyft’s business. Decisions ranging from pricing to ETA to business operations rely on it. Moreover, it powers the enormous scale and speed at which Lyft operates. In this talk, Mark Grover walks through the choices Lyft has made in developing and sustaining the data platform, the reasons behind them, and what lies ahead.
14:55–15:35 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Alexander Adam (Faculty)
The advent of "fake news" has led us to doubt the truth of online media, and advances in machine learning give us an even greater reason to question what we are seeing. Despite the many beneficial applications of this technology, it's also potentially very dangerous. Alex Adam explains how synthetic videos are created and how they can be detected.
16:35–17:15 Wednesday, 1 May 2019
Case studies, Strata Business Summit
Location: Capital Suite 12
Maurício Lins (everis consultancy UK), Lidia Crespo (Santander UK)
Big data is usually regarded as a menace to data privacy. However, with the right principles and mind-set, it can be a game changer that puts customers first and treats data privacy as an inalienable right. Santander UK applied this model to comply with GDPR, using graph technology, Hadoop, Spark, and Kudu to drive data obscuring, data portability, and machine learning exploration.
16:35–17:15 Wednesday, 1 May 2019
Data Science, Machine Learning & AI, Expo Hall
Location: Expo Hall (Capital Hall N24)
Maren Eckhoff (QuantumBlack)
The success of machine learning algorithms in a wide range of domains has led to a desire to leverage their power in ever more areas. Maren Eckhoff discusses modern explainability techniques that increase the transparency of black box algorithms, drive adoption, and help manage ethical, legal, and business risks. Many of these methods can be applied to any model without limiting performance.
17:25–18:05 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Chris Wallace (Cloudera)
Imagine building a model whose training data is collected on edge devices such as cell phones or sensors. Each device collects data unlike any other, and the data cannot leave the device because of privacy concerns or unreliable network access. This challenging situation is known as federated learning. Chris Wallace discusses the algorithmic solutions and the product opportunities.
10:15–10:35 Thursday, 2 May 2019
Location: Auditorium
Sandra Wachter (University of Oxford)
Keynote with Sandra Wachter
11:15–11:55 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
Scott Stevenson (Faculty)
Modern deep learning systems allow us to build speech synthesis systems with the naturalness of a human speaker. Whilst there are myriad benevolent applications, this also ushers in a new era of fake news. This talk will explore the danger of such systems, as well as how deep learning can also be used to build countermeasures to protect against political disinformation.
11:15–11:55 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
David Dogon (Van Lanschot Kempen)
This talk discusses a best-practice use case for detecting fraud at a financial institution. Where traditional systems fall short, machine learning models can provide a solution. Sifting through large amounts of transaction data, external hit lists, and unstructured text data, we managed to build a dynamic and robust monitoring system that successfully detects unwanted client behavior.
11:15–11:55 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 10/11
Eoin O'Flanagan (Newday), Darragh McConville (Kainos)
In this session you will learn how we have built a high-performance contemporary data processing platform, from the ground up, on AWS. We will discuss our journey from a legacy, on-site, traditional data estate to an entirely cloud-based, PCI DSS-compliant platform.
12:05–12:45 Thursday, 2 May 2019
Alasdair Allan (Babilim Light Industries)
The arrival of a new generation of smart embedded hardware may cause the demise of large-scale data harvesting. In its place, smart devices will allow us to process data at the edge and extract insights from it without storing data that could infringe privacy or GDPR. The current age, in which privacy is no longer "a social norm," may not long survive the coming of the Internet of Things.
12:05–12:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Vaclav Surovec (Deutsche Telekom), Gabor Kotalik (Deutsche Telekom)
Knowledge of customers' location and travel patterns is important for many companies, including German telco service operator Deutsche Telekom. Václav Surovec and Gabor Kotalik explain how a commercial roaming project using Cloudera Hadoop helped the company better analyze the behavior of its customers from 10 countries and provide better predictions and visualizations for management.
14:05–14:45 Thursday, 2 May 2019
Marcel Ruiz Forns (Wikimedia Foundation)
Analysts and researchers studying Wikipedia are hungry for long-term data to build experiments and feed data-driven decisions. But Wikipedia has a strict privacy policy that prevents storing privacy-sensitive data for more than 90 days. The Wikimedia Foundation's analytics team is working on a vegan data diet to satisfy both.
16:35–17:15 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Brennan Lodge (Goldman Sachs), Jay Kesavan (Bowery Analytics LLC)
Cyber security analysts are under siege to keep pace with the ever-changing threat landscape. The analysts are overworked, burned out, and bombarded with the sheer number of alerts that they must carefully investigate. To empower our cyber security analysts, we can use a data science model for alert evaluations.