Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Schedule: Law, ethics, and governance sessions

Open data and heightened privacy concerns mean new, and often controversial, thinking on governance, ethics, and compliance, as well as a renegotiation of the pact we make with a life lived in public.

9:00am–12:30pm Tuesday, March 6, 2018
Location: LL20 C Level: Intermediate
Mark Donsky (Cloudera), Andre Araujo (Cloudera), Syed Rafice (Cloudera), Mubashir Kazia (Cloudera)
New regulations such as GDPR are driving new compliance, governance, and security challenges for big data. Infosec and security groups must ensure a consistently secured and governed environment across multiple workloads that span on-premises, private cloud, multicloud, and hybrid cloud deployments. Mark Donsky walks you through securing a Hadoop cluster, with special attention to GDPR.
11:00am–11:40am Wednesday, March 7, 2018
Location: 210 C/G Level: Intermediate
Anne Buff (SAS Institute)
Emerging technologies such as the IoT, AI, and ML present businesses with enormous opportunities for innovation, but to maximize the potential of these technologies, businesses must radically shift their approach to governance. Anne Buff explains what it takes to shift the focus of governance from standards, conformity, and control to accountability, extensibility, and enablement.
1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 C/G Level: Beginner
John Mertic (The Linux Foundation), Maryna Strelchuk (ING)
John Mertic and Maryna Strelchuk detail the benefits of a vendor-neutral approach to data governance, explain the need for an open metadata standard, and share how companies such as ING, IBM, and Hortonworks are delivering solutions to this challenge as an open source initiative.
2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 D/H Level: Beginner
Or Herman-Saffar (Dell), Ran Taig (Dell EMC)
What if we could predict when and where crimes will be committed? Or Herman-Saffar and Ran Taig offer an overview of Crimes in Chicago, a public dataset of crime incidents reported in Chicago since 2001. Or and Ran explain how to use this data to explore past crimes, find interesting trends, and make predictions for the future.
11:00am–11:40am Thursday, March 8, 2018
Location: 210 D/H Level: Beginner
Sugreev Chawla (Thorn)
Sugreev Chawla offers an overview of Spotlight, a tool created by Thorn, a nonprofit that uses technology to fight online child sexual exploitation. Spotlight allows law enforcement to process millions of escort ads per month in an effort to fight sex trafficking, using graph analysis, time series analysis, and NLP techniques to surface important networks of ads and characterize their behavior over time.
1:50pm–2:30pm Thursday, March 8, 2018
Location: LL20 D Level: Intermediate
Jennifer Prendki (Atlassian)
Jennifer Prendki explains how to develop machine learning models even when the data is protected by privacy and compliance laws and cannot be used without anonymization, covering techniques ranging from contextual bandits to document vector representation.
1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 A/E Level: Intermediate
Mark Donsky (Cloudera)
In May 2018, the General Data Protection Regulation (GDPR) goes into effect for firms doing business in the EU, but many companies aren't prepared for the strict regulation or fines for noncompliance (up to €20 million or 4% of global annual revenue). Mark Donsky outlines the capabilities your data environment needs to simplify compliance with GDPR and future regulations.
2:40pm–3:20pm Thursday, March 8, 2018
Location: LL21 E/F Level: Intermediate
Ajay Mothukuri (Sapient), Arunkumar Ramanatha (Sapient), Dr. Vijay Srinivas Agneeswaran (SapientRazorfish)
Ajay Mothukuri, Arunkumar Ramanatha, and Vijay Srinivas Agneeswaran explain how to use open source blockchain technologies such as Hyperledger to implement the European Union's General Data Protection Regulation (GDPR).
2:40pm–3:20pm Thursday, March 8, 2018
Location: LL20 C Level: Intermediate
Pramit Choudhary (DataScience.com)
Pramit Choudhary explores the usefulness of a generative approach that applies Bayesian inference to generate human-interpretable decision sets in the form of "if...and else" statements. These human-interpretable decision lists with high posterior probabilities might be the right way to balance model interpretability, performance, and computation.