Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Schedule: Law, ethics, and governance sessions

Open data and heightened privacy concerns mean new, and often controversial, thinking on governance, ethics, and compliance, as well as a renegotiation of the pact we make with a life lived in public.

9:00am–12:30pm Tuesday, March 6, 2018
Location: LL20 C
Mark Donsky (Okera), Andre Araujo (Cloudera), Syed Rafice (Cloudera), Mubashir Kazia (Cloudera)
Average rating: 2.00 (1 rating)
New regulations are driving compliance, governance, and security challenges for big data, and infosec and security groups must ensure a consistently secured and governed environment across multiple workloads that span a variety of deployments. Mark Donsky, Andre Araujo, Syed Rafice, and Mubashir Kazia walk you through securing a Hadoop cluster, with special attention to GDPR.
9:00am–5:00pm Tuesday, March 6, 2018
Location: LL20 B
David Boyle (Audience Strategies), Violeta Hennessey (Warner Bros.), April Chen (Civis Analytics), Sridhar Alla (BlueWhale), Noah Gift (UC Davis), Blake Irvine (Netflix), Kevin Lyons (Nielsen Marketing Cloud), Jennifer Webb (SuprFanz), Rizwan Patel (Caesars Entertainment), Anthony Accardo (Disney), Amanda Gerdes (Blizzard Entertainment), Aneesh Karve (Quilt), Pete Skomoroch (Workday)
Hear from innovators in ad tech, measurement, automation, and audience engagement about where the media industry is today and where it's likely to go next.
11:00am–11:40am Wednesday, March 7, 2018
Location: 210 C/G
Anne Buff (SAS)
Average rating: 4.50 (2 ratings)
Emerging technologies such as the IoT, AI, and ML present businesses with enormous opportunities for innovation, but to maximize the potential of these technologies, businesses must radically shift their approach to governance. Anne Buff explains what it takes to shift the focus of governance from standards, conformity, and control to accountability, extensibility, and enablement.
1:50pm–2:30pm Wednesday, March 7, 2018
Location: 210 C/G
John Mertic (Linux Foundation), Maryna Strelchuk (ING)
John Mertic and Maryna Strelchuk detail the benefits of a vendor-neutral approach to data governance, explain the need for an open metadata standard, and share how companies such as ING, IBM, and Hortonworks are addressing this challenge through an open source initiative.
2:40pm–3:20pm Wednesday, March 7, 2018
Location: 210 D/H
Or Herman-Saffar (Dell), Ran Taig (Dell EMC)
Average rating: 1.67 (3 ratings)
What if we could predict when and where crimes will be committed? Or Herman-Saffar and Ran Taig offer an overview of Crimes in Chicago, a publicly available dataset of crime incidents reported in Chicago since 2001. Or and Ran explain how to use this data to explore past crimes, find interesting trends, and make predictions for the future.
11:00am–11:40am Thursday, March 8, 2018
Location: 210 D/H
Average rating: 2.00 (1 rating)
Sugreev Chawla offers an overview of Spotlight, a tool created by Thorn, a nonprofit that uses technology to fight online child sexual exploitation. Spotlight allows law enforcement to process millions of escort ads per month in an effort to fight sex trafficking, using graph analysis, time series analysis, and NLP techniques to surface important networks of ads and characterize their behavior over time.
1:50pm–2:30pm Thursday, March 8, 2018
Location: LL20 D
Jennifer Prendki (Figure Eight)
Average rating: 3.00 (1 rating)
Jennifer Prendki explains how to develop machine learning models even when the data is protected by privacy and compliance laws and cannot be used without anonymization, covering techniques ranging from contextual bandits to document vector representation.
1:50pm–2:30pm Thursday, March 8, 2018
Location: 210 A/E
Mark Donsky (Okera), Steven Ross (Cloudera)
In May 2018, the General Data Protection Regulation (GDPR) goes into effect for firms doing business in the EU, but many companies aren't prepared for the strict regulation or its fines for noncompliance (up to €20 million or 4% of global annual revenue, whichever is higher). Mark Donsky and Steven Ross outline the capabilities your data environment needs to simplify compliance with GDPR and future regulations.
2:40pm–3:20pm Thursday, March 8, 2018
Location: LL21 E/F
Ajay Kumar Mothukuri (Sapient), Vijay Agneeswaran (Walmart Labs)
Ajay Mothukuri and Vijay Srinivas Agneeswaran explain how to use open source blockchain technologies such as Hyperledger to implement the European Union's General Data Protection Regulation (GDPR).
2:40pm–3:20pm Thursday, March 8, 2018
Location: LL20 C
Pramit Choudhary (
Average rating: 5.00 (3 ratings)
Pramit Choudhary explores the usefulness of a generative approach that applies Bayesian inference to generate human-interpretable decision sets in the form of "if. . .and else" statements. These human-interpretable decision lists with high posterior probabilities might be the right way to balance model interpretability, performance, and computation.