Presented By
O’Reilly + Intel AI
Put AI to Work
April 15-18, 2019
New York, NY

Executive Briefing: Overview of Data Governance

Paco Nathan
4:05pm-4:45pm Thursday, April 18, 2019
Secondary topics: AI in the Enterprise, Automation in machine learning and AI, Data and Data Networks

Who is this presentation for?

managers, execs, team leads for data-related organizations



Prerequisite knowledge

some experience in analytics as well as business management

What you'll learn

history, themes, and current drivers regarding data governance in industry
survey of tools, vendors, process, standards, open source projects, etc.
interviews with experts about issues and best practices
impact of security concerns, ethics and bias in ML, highly regulated environments, "democratizing data", and workflow reproducibility
what impact does machine learning have on data governance and vice versa?
risk management plays the "thin edge of the wedge" for these changes in enterprise
how does the emerging "Chief Data Officer" role fit?


Data governance is an almost overwhelming topic, one which touches on many aspects of data and analytics in enterprise. This talk surveys the history, themes, and current drivers of data governance in industry, along with tools, vendors, process, standards, open source projects, etc. It is partly based on interviews with experts in this field about issues and best practices.

On the one hand, poor data governance can lead to system data quality issues, lack of data availability, and other risks, which mean that people within an organization cannot leverage data effectively, thus limiting ROI. On the other hand, there is a galaxy of compliance issues aimed at preventing the risks that arise when people leverage data inappropriately. While many are quick to mention GDPR, many other standards are in play, depending on the business vertical.

However, several other issues create drivers for data governance: how security concerns are reshaping the structure of web apps; ethics, bias, and other needs for ML transparency; the implications of “democratizing data and analytics,” both pro and con; the priority for reproducibility in analytics workflows; and unexpected ways in which open source is evolving rapidly in highly regulated environments.

Although many practices emerged from the era of data warehouses, Big Data changed the game and began drawing attention from regulators. Facebook, Twitter, and other tech giants now testify before US senators, who in turn struggle to grasp basic concepts in IT. Meanwhile, the IT landscape is evolving rapidly: new forms of hardware and networking, serverless cloud offerings, edge computing, etc., which redefine even the basic concepts related to data governance.

Effective data governance is foundational for AI adoption in enterprise — that’s proving to be table stakes. Some have even begun to describe AI in terms of building capital stock. What impact does machine learning have on data governance and vice versa?

Ultimately, risk management plays the “thin edge of the wedge” for these changes in enterprise, while the mantle of responsibility for data governance moves toward the emerging Chief Data Officer role.

Paco Nathan

Paco Nathan is known as a “player/coach” with core expertise in data science, natural language processing, machine learning, and cloud computing. He has 35+ years of experience in the tech industry, at companies ranging from Bell Labs to early-stage startups. Paco is co-chair of JupyterCon and Rev, and an advisor for Amplify Partners, Deep Learning Analytics, Recognai, and Data Spartan. Recent roles include director of the Learning Group at O’Reilly Media and director of community evangelism at Databricks and Apache Spark. In 2015 he was named one of the top 30 people in big data and analytics by Innovation Enterprise.
