Liberating Data @ Wikipedia

Diederik van Liere (Wikimedia Foundation)
Location: Online

Wikipedia’s editors and readers create a digital footprint of who we are. That footprint is defined not just by the articles that editors write, but also by what readers read and how editors interact with each other. The result is a large volume of data, most of which is publicly available: the full contents of all the articles, as well as data about pageviews, policy making, and editor interactions.

The purpose of this presentation is to give a quick overview of the data sources that are available for free, the tools that the Wikimedia Foundation is developing to analyze these datasets, and how we want to attract a new audience of data lovers and data geeks to help us expand our understanding of Wikipedia as a microcosm.

Diederik van Liere works as the Product Manager Analytics at the Wikimedia Foundation. The Wikimedia Foundation is a nonprofit charitable organization dedicated to encouraging the growth, development and distribution of free, multilingual content, and to providing the full content of these wiki-based projects to the public free of charge. The Foundation operates some of the largest collaboratively edited reference projects in the world, including Wikipedia, a top-ten internet property.

Diederik van Liere

Wikimedia Foundation

PhD by training, data scientist by passion. Currently working for the Wikimedia Foundation to build a community analytics platform.