Using big data effectively almost always involves large amounts of cleaning and processing. Proper categorization and attribute labels are essential. In many cases some of the steps can only be done manually, making crowdsourcing a crucial tool for data scientists.
This talk will describe microtasking, where it fits in the crowdsourcing landscape, and how data scientists and developers can most effectively tap into the crowd to collect and process their data sets. Several real-world cases will be used to illustrate the possibilities, including tweet analysis, social profile mining, and pre-processing satellite imagery for big data queries. In this talk I will also take a stab at predicting where the state of the art will be a year from now.
Lukas Biewald is the founder and CEO of CrowdFlower. Founded in 2007, CrowdFlower provides Labor-on-Demand to help companies outsource high-volume, repetitive tasks to a massively-distributed global workforce.
Before founding CrowdFlower, Lukas was a senior scientist and manager within the Ranking and Management Team at Powerset, Inc., which was acquired by Microsoft in 2008. He led the Search Relevance Team for Yahoo! Japan after graduating from Stanford University with a B.S. in Mathematics and an M.S. in Computer Science. Recently, Lukas won the Netexplorateur Award for GiveWork – a collaboration with Samasource that brings digital work to refugees worldwide. Lukas is also an expert-level Go player.