Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

The Paradise Papers and West Africa Leaks: Behind the scenes with the ICIJ

Pierre Romera (International Consortium of Investigative Journalists (ICIJ))
3:50pm-4:30pm Thursday, March 28, 2019

Who is this presentation for?

  • Investigative journalists, data scientists, engineers, developers, and senior-level executives



What you'll learn

  • Explore data visualization technologies that allow for effective investigative journalism
  • See how the ICIJ used enterprise software beyond traditional business analytical needs and for social good
  • Discover how computing technologies can create a larger impact when they are used by a wide variety of people across industries


Computing technologies of all kinds, including data, code, and algorithms, wield real power in our society. Journalists need to be able to interrogate that power, questioning both the virtual and physical systems shaping the world around them. This questioning extends to the very networks used to circulate journalistic work, which give some groups outsized voices and sometimes lead to the proliferation of misinformation. All of this is a response to the growing availability and popularity of software systems among nonexperts. Access to simple computational techniques has forever changed not only the subjects of reporting but also the way in which journalism is practiced. Journalists now make use of tools for large-scale data analysis that were previously only accessible to technical experts.

In November 2017, the International Consortium of Investigative Journalists (ICIJ) published the Paradise Papers, a yearlong investigation into the offshore dealings of multinational companies and wealthy individuals around the globe. Four hundred journalists from more than 60 countries worked together, sifting through 13 million documents to uncover stories about the Canadian prime minister’s chief fundraiser and jets from the Isle of Man, among other things. In May 2018, the ICIJ released additional findings, titled the “West Africa Leaks,” to shine a light on abuses of trust and ethics by business and political leaders across 15 West African countries, an impoverished region of 367 million people.

ICIJ’s CTO Pierre Romera explains how his team made sense of the massive amounts of data using the following tools:

  • Neo4j (data storage, graph database)
  • Apache Tika (data and metadata extraction)
  • Apache Solr (indexing)
  • Blacklight (user interface)
  • Tesseract (optical character recognition (OCR))
  • Talend (data extraction, transformation, and loading (ETL))
  • Linkurious (user interface, visualization)
  • PGP (secure communication)
  • Signal (secure communication)
  • Internally created tools, including customization of other tools to enhance security

Pierre Romera

International Consortium of Investigative Journalists (ICIJ)

Pierre Romera is the chief technology officer at the International Consortium of Investigative Journalists (ICIJ), where he manages a team of programmers working on the platforms that enabled more than 300 journalists to collaborate on the Paradise Papers and Panama Papers investigations. Previously, he cofounded Journalism++, the Franco-German data journalism agency behind the Migrant Files, a project that won the 2015 European Press Prize for innovation. He is one of the pioneers of data journalism in France.