Big Data and the Social Firehose

Nick Halstead (DataSift)
Sponsored Session, Ballroom H

Social data is growing: Twitter produces more than 250 million tweets per day, along with 27 million links to news and media. Big Data techniques can yield insights from these large datasets, but the data must be curated, cleaned and quantified before it has value. We will cover how we move from unstructured to structured data, and how we take simple data and apply complex processes to give it context.

We will cover how we developed a platform that can handle billions of items per day and perform complex analysis before handing the data on to thousands of customers in real time. We will also walk through our platform architecture, looking at our use of Hadoop, HBase, 0MQ, Kafka and many other cutting-edge technologies. You will learn some of the pitfalls of running a production Hadoop cluster, and the value it delivers once you make it work.

Nick Halstead


Nick Halstead is the founder of DataSift Inc., the real-time social media data-filtering platform. Over the past five years, Nick has been one of the foremost technical visionaries on the power of social data to revolutionize information delivery. He previously founded TweetMeme, the leading platform for delivering social news, which quickly built an audience of millions in 30 countries. TweetMeme also invented the highly successful Retweet button, which serves more than 30 billion clicks per month and drives high volumes of traffic for Twitter. Nick is a regular speaker at events such as TechCrunch Disrupt, Le Web, Future of Web Apps (FOWA), The Next Web and Strata, and has also spoken at SXSW.


