Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Open corporate ownership data

Emma Deraze (DataKind UK)
16:35–17:15 Wednesday, 24 May 2017
Law, ethics, governance
Location: Capital Suite 14
Level: Non-technical

Who is this presentation for?

  • Journalists, government officials, data scientists, data analysts, and lawyers

Prerequisite knowledge

  • General familiarity with databases and data querying

What you'll learn

  • Explore a collaborative project between DataKind, Global Witness, and OpenCorporates to analyze open UK corporate ownership data
  • Understand the hurdles to a consistent, international database of corporate data due to the variety of jurisdictions and approaches to data gathering


DataKind brings together skilled data scientists and social change organizations to collaborate on cutting-edge analytics that maximize social impact. Emma Deraze explores a collaborative project between DataKind, Global Witness, and OpenCorporates to analyze open UK corporate ownership data and presents findings on the challenges facing open official data, particularly where the subject, such as complex corporate networks, is international in scope.

The project examined the UK's beneficial ownership data. It was the first investigation of this dataset, since this was the first year the UK government collected it. The project had a twofold aim: to better understand the UK corporate ownership network and to assess the quality of the data itself, possibly leading to suggestions for collecting it more accurately. Researchers also wanted to define red flags and create an automated system for identifying them, leading to greater transparency of ownership. The analysis included building a graph database of the dataset and attempting entity disambiguation (including dealing with free-text country fields).
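As a rough illustration of the kind of analysis described here, the following Python sketch normalizes free-text country values and checks a small ownership graph for one possible red flag. It is not the project's actual code: the alias table, the function names, the sample records, and the choice of circular ownership as a red flag are all assumptions made for this example, and a plain dictionary stands in for a real graph database.

```python
import re
from collections import defaultdict

# Hypothetical alias table: cleaned free-text value -> canonical country name.
# A real table would be far larger and built from the data itself.
COUNTRY_ALIASES = {
    "uk": "United Kingdom",
    "u k": "United Kingdom",
    "united kingdom": "United Kingdom",
    "great britain": "United Kingdom",
    "england": "United Kingdom",
    "bvi": "British Virgin Islands",
    "british virgin islands": "British Virgin Islands",
}


def normalize_country(raw):
    """Return a canonical country name, or None if the value is unrecognized."""
    # Lowercase, turn punctuation and digits into spaces, trim the ends.
    cleaned = re.sub(r"[^a-z]+", " ", raw.lower()).strip()
    return COUNTRY_ALIASES.get(cleaned)


def build_owner_graph(records):
    """Map each company to the set of entities recorded as controlling it."""
    owners = defaultdict(set)
    for owner, company in records:
        owners[company].add(owner)
    return owners


def in_ownership_loop(owners, company):
    """Return True if following owner links from `company` leads back to it."""
    stack = list(owners.get(company, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == company:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(owners.get(node, ()))
    return False


# Made-up (owner, company) pairs, for illustration only.
records = [
    ("A Ltd", "B Ltd"),
    ("B Ltd", "A Ltd"),
    ("Jane Doe", "C Ltd"),
]
graph = build_owner_graph(records)
flagged = sorted(c for c in graph if in_ownership_loop(graph, c))
```

With these sample records, `flagged` contains "A Ltd" and "B Ltd", which own each other, while "C Ltd" is not flagged. Values that `normalize_country` cannot map come back as `None`, so they can be set aside for manual review rather than guessed.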

Emma also offers a more general view of the issues of open official data: How open is it, really? How useful is it to the general public? And how open can potentially sensitive data with privacy requirements actually be? The project’s analysis uncovered many issues with the data that would be major hurdles for anyone looking to use it for investigative purposes. Moreover, this is a national-level dataset, which has obvious limitations for a subject such as corporate ownership that very quickly becomes international. Emma covers these conceptual problems and briefly discusses organizations, such as the OCCRP and the ICIJ, that are actively trying to address them.

Emma Deraze

DataKind UK

Emma Deraze is a data scientist with TES Global and a volunteer at DataKind UK.