This talk is based on recent papers we published in Nature Scientific Reports on the privacy of mobility data and in IEEE Data Engineering on openPDS and the New Deal on Data. This work, done at the MIT Media Lab and Louvain, shows that location data might not be as anonymous as we think. At a time when tremendous amounts of user data are becoming available, understanding the limits of individuals’ privacy will be crucial in the design of both future policies and information technologies. This work has been covered by WEF, BBC, CNN, GigaOm, Wired, Technology Review, etc.
In this talk, I will show how four points (approximate places and times) are enough to identify 95% of individuals in a mobility database covering 1.5 million people over 15 months. This result means that identifying people in a large-scale location database is likely to be easy, even though no “private” information such as names, e-mails or phone numbers was ever collected. Location data thus truly acts as a fingerprint. This digital fingerprint turns out to be more unique than the traditional physical fingerprint.
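To make the idea of identifying someone from a handful of points concrete, here is a minimal sketch of how such a uniqueness estimate could be computed on a toy dataset. It is an illustration under stated assumptions, not the method or code used in the papers: the function name `unicity`, the trace format (a set of `(place, hour)` tuples per user), and the toy data are all hypothetical.

```python
import random


def unicity(traces, p, trials=100, seed=0):
    """Estimate the fraction of sampled traces that p known points single out.

    traces: dict mapping a user id to a set of (place, hour) tuples.
    This layout is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    # Only users with at least p recorded points can be sampled.
    users = [u for u, pts in traces.items() if len(pts) >= p]
    if not users:
        raise ValueError("no trace has at least p points")
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # Draw p points that an observer is assumed to know about this user.
        known = set(rng.sample(sorted(traces[user]), p))
        # Count every trace in the dataset that contains all known points.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:  # only the sampled user's own trace matches
            unique += 1
    return unique / trials


# Hypothetical toy dataset: three users, three (place, hour) points each.
toy = {
    "a": {("cell_1", 8), ("cell_2", 12), ("cell_3", 18)},
    "b": {("cell_1", 8), ("cell_2", 12), ("cell_4", 18)},
    "c": {("cell_5", 9), ("cell_2", 12), ("cell_3", 18)},
}

for p in (1, 2, 3):
    print(p, unicity(toy, p))
```

In this toy dataset, knowing all three points always singles out one user, while a single point often matches several users. The estimate therefore rises as more points are known, which is the qualitative effect described above.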
I will further show how human behavior places fundamental constraints on the privacy of individuals. What is particularly interesting here is that these constraints hold even when the resolution of the dataset is low. This means that even coarse datasets provide little anonymity. Using these large-scale data, I will show how we derived a formula to estimate the uniqueness of human mobility traces. This formula can be used as a rule of thumb to estimate the privacy of a dataset given its spatial and temporal resolution.
These data are, however, of great value, and all of us (users, companies and scientists) have a lot to gain from their use. There is far more to location data than just privacy concerns. It is therefore of tremendous importance to understand how to use these data while preserving people’s privacy. I hope this talk will help attendees understand what is possible and what is not possible when it comes to privacy. I will conclude by discussing some of the legal and technical solutions we are currently developing at the Media Lab.
Yves-Alexandre de Montjoye is a researcher at the MIT Media Lab, where he is engineering stochastic tools to harness the power of rich behavioral datasets, such as human movement data and communication patterns in networks. He is also interested in how the unicity of human behavior and the richness of these datasets impact individuals’ privacy. His research has been covered in BBC News, CNN, The New York Times, MIT Technology Review, Wired, and The Huffington Post. Before coming to MIT, he was a researcher at the Santa Fe Institute, where he used cell phone data to model the dynamics of social support. Over a period of 6 years, he obtained an MSc in applied mathematics and a BSc in engineering from Louvain; an MSc (Centralien) from Ecole Centrale Paris; and an MSc in mathematical engineering from KU Leuven.