Strata in London 2013 Schedule

Below are the confirmed and scheduled talks at Strata in London 2013 (schedule subject to change).

King's Suite - Balmoral
11:20 Bootstrapping Graph Search Ian Hegerty (Facebook)
13:15 How Stuff Spreads Francesco D'Orazio (FACE)
16:25 Fixing the Trouble with Things Alasdair Allan (Babilim Light Industries)
17:15 Data Science and the Internet of Things - Deep Analytics on Traffic Data Alexander Kagoshima (Pivotal), Noelle Sio (Pivotal)
King's Suite - Sandringham
13:15 Behind the Data Sensing Lab: Building a Platform for Highly Scalable, Rapid Data Collection and Analysis Amy Unruh (Google), Felipe Hoffa (Google), Alasdair Allan (Babilim Light Industries)
14:05 Cascalog for Data Scrubbing Bruce Durling (Mastodon C)
16:25 Making text classification trivial - combining Apache Lucene and Mahout Isabel Drost (Apache Software Foundation/ Nokia Gate 5 GmbH)
Palace Suite - Buckingham Room
11:20 Human Progress: D3.js and a responsive design show how we're getting better all the time Marc Garrett (Intridea), Maggie Lubberts (Intridea)
13:15 The Power Of Visualizing Deforestation Data Andrew Hill (Set), Robin Kraft (World Resources Institute), Javier de la Torre (Vizzuality)
14:05 Ships Around the World: Processing and Visualizing Large Global Datasets with the Google Cloud Mano Marks (Google, Inc.), Kurt Schwehr (Google, Inc.)
15:35 Creating big data journalism impact with tiny resources Claire Miller (Trinity Mirror Regionals)
16:25 Data Journalism in Mexico: Promote among journalists use open data in the service of citizens. Lilia Saul (México Infórmate), Gabriela Morales (México Infórmate)
19:00 Plenary
Room: Palace Suite - Buckingham Room
BigData London Meetup at Strata (Community Event)
Palace Suite - Blenheim Room
13:15 Data as an Art Material. Case Study: The Open Data Institute Julie Freeman (Queen Mary University of London)
14:05 No More Crayons Simon Everest (Government Digital Service)
15:35 Future Cities - data science in urban infrastructure Simon Williams (QuantumBlack)
16:25 Fixing the Economy through Data Science Stian Westlake (Nesta), Louise Marston (Nesta), Hasan Bakhshi (Nesta)
19:00 Plenary
Room: Palace Suite - Blenheim Room
Hadoop User Group UK Meetup at Strata (Community Event)
Westminster
13:15 How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron & Big Data. Steven Totman (Syncsort), Matt Brandwein (Cloudera)
14:05 A New Paradigm in Cross Domain Data Management Brian Knox (Talksum, Inc.)
15:35 Seven Fun Things To Do With MapReduce Christopher Hillman (Teradata)
9:00 Plenary
Room: King's Suite
Welcome and Announcements Edd Wilder-James (Google), Kaitlin Thaney (Mozilla Science Lab)
9:10 Plenary
Room: King's Suite
The Future of Data Doug Cutting (Cloudera)
9:30 Plenary
Room: King's Suite
The Analytical Imperative Duncan Ross (TES Global)
9:40 Plenary
Room: King's Suite
Spreadsheets: The Ununderstood Dark Matter Of IT Felienne Hermans (Delft University of Technology)
9:55 Plenary
Room: King's Suite
Introducing Dat: If Git Were Designed For Big Data Max Ogden (Independent)
10:10 Plenary
Room: King's Suite
Data nerding in public health Francine Bennett (Mastodon C)
10:25 Plenary
Room: King's Suite
Demonstrating The Actual Economic Value of Data Tim Kelsey (National Health Service)
18:00 Event - Sponsored by Teradata
Room: Sponsor Pavilion
Attendee Reception/Startup Showcase Eileen Burbidge (Passion Capital), Alexandra Deschamps Sonsino (Good Night Lamp), JP Rangaswami (Salesforce)
10:50 Morning Break - Sponsored by MammothDB
Room: Monarch Suite
12:00 Lunch - Sponsored by Teradata
Room: Monarch Suite
14:45 Afternoon Break / Startup Showcase - Sponsored by MammothDB
Room: Monarch Suite
Afternoon Break / Startup Showcase
8:00 Morning Coffee Service
Room: King's Suite Foyer
11:20-12:00 (40m) Business & Industry, Data Science, Open Data
Bootstrapping Graph Search
Ian Hegerty (Facebook)
In January, Facebook launched Graph Search in the US, allowing users to search their social graph. Ian Hegerty will describe how the Graph Search corpus was built from Facebook's entity graph, and how big data is used to understand users' queries and provide relevant results with minimal initial user behavioral data.
13:15-13:55 (40m) Business & Industry, Data Science, Design, Tools & Technology
How Stuff Spreads
Francesco D'Orazio (FACE)
How Stuff Spreads looks at how two recent memes spread online: Gangnam Style vs Harlem Shake. The talk dissects the memes through the lens of big data to show what made them go viral, what they have in common, how quantitative and qualitative analysis have to come together to craft insights and tell a story, and finally how to predict future memes and create a data-driven content strategy.
14:05-14:45 (40m) Business & Industry, Data Science, Tools & Technology
Data Science Provenance: from drug discovery to fake fans
Jameel Syed
How do we know what we know? Increasingly discoveries are made from computed data, possibly sourced from the internet. If we are to trust these discoveries, how conclusions are reached is critical. Examples from work in Big Data analytics infrastructure for life sciences and social media analysis will illustrate the key issues.
15:35-16:15 (40m) Data Science, Design, Tools & Technology
Changing the nature of health care with sensors and data science
Yodit Stanton (opensensors.io)
Medical treatments have come a long way in the last couple of decades. On the other hand, we could be doing a lot better at monitoring people within their own homes between hospital visits using sensors. Sensors combined with Big Data technologies are set to bring about profound changes for the future of health and social care.
16:25-17:05 (40m) Data Science
Fixing the Trouble with Things
Alasdair Allan (Babilim Light Industries)
Everyday objects are becoming smarter. In ten years’ time, every piece of clothing you own, every piece of jewelry, and every thing you carry with you will be measuring, weighing and calculating your life. In ten years, the world — your world — will be full of sensors.
17:15-17:55 (40m) Data Science
Data Science and the Internet of Things - Deep Analytics on Traffic Data
Alexander Kagoshima (Pivotal), Noelle Sio (Pivotal)
In the future we will see huge growth in the amount of traffic data generated through built-in car sensors. This talk presents a case study of analytics on traffic and traffic light data. Methods will be presented that yield a deep understanding of traffic and its characteristics by analyzing past traffic data. These methods could be extended to predict traffic jams and optimize routing systems.
11:20-12:00 (40m) Data Science, Tools & Technology
Spark and Shark: High-Speed Analytics over Hadoop and Hive Data
Patrick Wendell (Databricks)
As big data analytics evolves beyond simple batch jobs, there is a need for both lower-latency processing (interactive queries and stream processing) and more complex analytics (e.g. machine learning, graph algorithms). This talk will introduce Spark and Shark, popular open source projects from Berkeley that address this need through an optimized runtime engine and in-memory computing capabilities.
13:15-13:55 (40m) Tools & Technology
Behind the Data Sensing Lab: Building a Platform for Highly Scalable, Rapid Data Collection and Analysis
Amy Unruh (Google), Felipe Hoffa (Google), Alasdair Allan (Babilim Light Industries)
In May 2013, the O'Reilly Data Sensing Lab collaborated with the Google Cloud Platform and Device Cloud by Etherios to deploy a network of hundreds of environmental sensors at Google I/O. Learn how the Google Cloud Platform was used to build an end-to-end, scalable, and high-throughput pipeline for data collection, processing, and analysis.
14:05-14:45 (40m) Open Data, Tools & Technology
Cascalog for Data Scrubbing
Bruce Durling (Mastodon C)
It has been said by many that 80% of data science is scrubbing data. In this talk we'll cover how you can use Cascalog to scrub, transform, manipulate and mangle data into the formats you need, fix things that are wrong and filter out things that are broken. Clojure and Cascalog together provide fantastic tools for this. Learn about using Hadoop with the messy data that exists in the real world.
15:35-16:15 (40m) Data Science
Online Machine Learning With Distributed In-memory Clusters
Arshak Navruzyan (Argyle Data)
The fast read and write performance and scalability of distributed in-memory clusters are making it possible to retrain machine learning algorithms in real time. The application of such algorithms to risk, infrastructure security and other areas can be transformative.
16:25-17:05 (40m) Tools & Technology
Making text classification trivial - combining Apache Lucene and Mahout
Isabel Drost (Apache Software Foundation/ Nokia Gate 5 GmbH)
"In order to classify documents, simply first convert them to vectors, train, test and finally apply the model." Sounds easy - in theory. Converting documents to vectors usually is the tricky part. This talk walks you through the steps necessary to convert your text documents into feature vectors that Mahout classifiers can use including a few anecdotes on drafting domain specific features.
17:15-17:55 (40m) Business & Industry
Lessons from High-Frequency Trading - How Enterprise Data Is Collected and Analyzed in Real Time
Volkmar Uhlig (Adello)
The value of machine-generated data is highest at the moment it is generated. Operational data, sensor data and video feeds require new automated approaches to maximize the value of these streams of data. In this session, Dr. Volkmar Uhlig will explore how to apply the lessons learned from automated high-frequency trading systems to today’s Big Data problems to monetize information.
11:20-12:00 (40m) Data Science, Design, Open Data, Tools & Technology
Human Progress: D3.js and a responsive design show how we're getting better all the time
Marc Garrett (Intridea), Maggie Lubberts (Intridea)
We're getting better all the time. See how the Cato Institute used responsive design and D3.js to show how human development indicators improve as economic freedom spreads.
13:15-13:55 (40m) Data Science
The Power Of Visualizing Deforestation Data
Andrew Hill (Set), Robin Kraft (World Resources Institute), Javier de la Torre (Vizzuality)
Maps are powerful tools for people to learn from data. In this project, we combine large-scale data processing with Hadoop and data visualization through CartoDB to make over six years of bi-monthly deforestation data accessible in an interactive map on the web. This talk will tell the story of how large-scale data paired with visualization can make data accessible in important new ways.
14:05-14:45 (40m) Business & Industry, Data Science, Tools & Technology
Ships Around the World: Processing and Visualizing Large Global Datasets with the Google Cloud
Mano Marks (Google, Inc.), Kurt Schwehr (Google, Inc.)
Many big data solutions focus on large data analysis that happens in data centers. Or they focus on data visualization in the browser. When you combine both of these techniques, you get amazing and expressive power. This talk will show how to use the Google Maps API with WebGL and Google BigQuery, Cloud Storage, App Engine and Compute Engine to deliver amazing, responsive visualizations.
15:35-16:15 (40m) Data Science, Open Data
Creating big data journalism impact with tiny resources
Claire Miller (Trinity Mirror Regionals)
How do you do data journalism when you are not the Guardian, the New York Times or the Washington Post? You don't need a data team, developers, much time or any funding to get started and produce data journalism that grabs headlines and engages readers. This workshop will focus on quick start techniques for getting started and making the most of few resources.
16:25-17:05 (40m) Open Data
Data Journalism in Mexico: Promote among journalists use open data in the service of citizens.
Lilia Saul (México Infórmate), Gabriela Morales (México Infórmate)
In Mexico, open data journalism is still difficult to practice. At México Infórmate, an organization to which I belong, we developed a research project using databases. The result was a map (www.mexicoinformate.org/platform) that has managed to draw attention to the slow progress of the law enforcement system in the country.
17:15-17:55 (40m) Business & Industry, Ethics, Policy & Privacy, Open Data
How OpenCorporates built the world's biggest open database of companies in just 2 years, 2 people, and under £50k
Chris Taggart (OpenCorporates)
Since it launched just 2 years ago, OpenCorporates has leveraged the open data community to grow into by far the largest open database of companies in the world, with over 50 million companies in 70 jurisdictions. It is regularly used by journalists, anti-corruption investigators, civil society, and even banks and financial institutions.
19:00-21:10 (2h 10m) Event
BigData London Meetup at Strata (Community Event)
A special edition of the Big Data London Meetup will take place at the Strata Conference venue on the evening of day 1, promising a great crowd and amazing talks.
11:20-12:00 (40m) Data Science
Finding great properties: How Airbnb uses open-source technology and analytics to deliver meaningful experiences
Jan Overgoor (Airbnb)
For a two-sided marketplace like Airbnb, the search engine is the main driver of the health of the business. We developed an open-source technology stack and a set of analytical methods to optimize the search experience for our users and search conversion for our business. We’ll discuss the tools we use for data crunching, analysis and reporting, as well as our thoughts on experimental design.
13:15-13:55 (40m) Design, Open Data
Data as an Art Material. Case Study: The Open Data Institute
Julie Freeman (Queen Mary University of London)
A vending machine that dispenses crisps in response to financial news, laser sketches from 4000 years of solar eclipse data, and a mural created from QR codes acting as cellular automata. Featuring a diverse exploration of data, the ODI will showcase art commissioned and curated for its on-going Data as Culture programme, which enables artists to explore data as an art material.
14:05-14:45 (40m) Business & Industry
No More Crayons
Simon Everest (Government Digital Service)
Farmers in the UK receive subsidies via the Common Agricultural Policy. These form a significant part of their income, approximately £2 billion per year in total. Changes to the data driving the payments are submitted by farmers as hand-drawn imagery. I'll share what the Government Digital Service has learned working with Defra, prototyping ways to capture changes that are sustainable, cheaper and more accurate.
15:35-16:15 (40m) Business & Industry, Data Science
Future Cities - data science in urban infrastructure
Simon Williams (QuantumBlack)
Crossrail will help deliver a new London. It is one of the largest civil engineering projects -- taking place literally under the feet of Strata London. We'll present how data science is being deployed at Crossrail to fundamentally change the way decisions are made and the operation is being run; from the CEO to the engineers monitoring ground movement in the tunnels.
16:25-17:05 (40m) Data Science, Ethics, Policy & Privacy
Fixing the Economy through Data Science
Stian Westlake (Nesta), Louise Marston (Nesta), Hasan Bakhshi (Nesta)
The economy is in a mess. But good data can help fix it. Timely analysis of large data sets is beginning to provide insight into what's really happening to business growth, employment and prosperity. We'll look at some of the most exciting examples of how Big Data is changing the way we look at the economy, and how governments and businesses can use them to their advantage.
17:15-17:55 (40m) Business & Industry
Big Data for Big Power: How smart is the grid if the infrastructure is stupid?
Brett Sargent (LumaSense Technologies Inc.)
The "smart grid" isn't just about smart meters. Every stage of our electrical power infrastructure has to be "smart," including generation, transmission and distribution. Sophisticated sensors connected to software platforms that continuously gather, visualize and analyze massive amounts of data in real time to produce actionable insights are critical to optimizing our energy assets.
19:00-21:00 (2h) Event
Hadoop User Group UK Meetup at Strata (Community Event)
Big data & Hadoop fans can come together at this special Hadoop Users Group UK meetup at Strata - mingle with fellow Hadoopers for food, drinks, discussions, and networking.
11:20-12:00 (40m) Sponsored
Where Polyglot Persistence meets the Lambda Architecture
Michael Hausenblas (Red Hat)
I will discuss and showcase polyglot persistence and the lambda architecture. Based on real-world examples and case studies from our customer base and the wider community of practitioners, incl. the financial, energy & media industries and the realm of public data sources (government data), we will elaborate on opportunities and challenges of these new data management and processing memes.
13:15-13:55 (40m) Sponsored
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron & Big Data.
Steven Totman (Syncsort), Matt Brandwein (Cloudera)
Mainframe is Big Data too! Leveraging it in Hadoop creates a remarkable competitive advantage, but exploiting it without the right tools is nearly impossible, requiring you to wrestle with thousands of lines of Java, Pig, Hive, COBOL and more. This session presents a smarter way to ingest and process mainframe data in Hadoop, and how to bridge the technical, skill and cost gaps between the two.
14:05-14:45 (40m) Sponsored
A New Paradigm in Cross Domain Data Management
Brian Knox (Talksum, Inc.)
In the era of M2M communication and the Internet of Things, on top of the traditional 3 Vs of Big Data (Volume, Variety and Velocity), we need to be able to process ephemeral data that is produced by dispersed sources and must be organized and distributed to multiple services. Real-time response, security, compliance, compatibility and breaking silos all require new approaches to data management.
15:35-16:15 (40m) Sponsored
Seven Fun Things To Do With MapReduce
Christopher Hillman (Teradata)
In this presentation Chris will look at seven MapReduce techniques that he enjoys playing with: the bits that are fun, exciting, and that can provide valuable insight into your big data. With examples of code (and where you can look for it) for web, image and text processing, he'll show things that can quickly allow you to extend your analysis beyond traditional data mining.
16:25-17:05 (40m) Sponsored
Turn Hadoop Data into Business Insights: A New Approach for Rapid Exploration and Analysis
Brett Sheppard (Splunk)
How can business and IT users easily explore, analyse and visualise data in Hadoop? Learn about alternatives to manually writing jobs or setting up predefined schemas and how a leading enterprise used Splunk and their Hadoop distribution to empower them with new access to Hadoop data. See how they got up and running in under an hour and enabled their developers to start writing big data apps.
9:00-9:10 (10m)
Welcome and Announcements
Edd Wilder-James (Google), Kaitlin Thaney (Mozilla Science Lab)
Program Chairs, Edd Dumbill and Kaitlin Thaney, open the first day of keynotes.
9:10-9:30 (20m) Data Science
The Future of Data
Doug Cutting (Cloudera)
As technology further pervades enterprises, each generates more data. Once harnessed, this data can enhance business, enabling growth. A new home for data has arrived to better support this: the Enterprise Data Hub, with Apache Hadoop at its center. Doug will discuss the trends that drive this and speculate on where they lead.
9:30-9:40 (10m) Sponsored
The Analytical Imperative
Duncan Ross (TES Global)
Big data has proved its worth in a number of industries, but it's not the size or the storage that is making the difference. The organisations that are delivering most value are the ones that have realised the need to drive analytics into the heart of their decision-making process.
9:40-9:55 (15m)
Spreadsheets: The Ununderstood Dark Matter Of IT
Felienne Hermans (Delft University of Technology)
In this talk Felienne will summarize her recently completed PhD research on the topic of spreadsheet structure visualization, spreadsheet smells and clone detection, as well as presenting a sneak peek into the future of spreadsheet research at Delft University.
9:55-10:10 (15m) Data Science
Introducing Dat: If Git Were Designed For Big Data
Max Ogden (Independent)
Dat aims to bring a distributed collaboration flow to big data. Git and GitHub have done it for source code, but we don't yet have a social data solution.
10:10-10:25 (15m) Business & Industry, Data Science, Open Data
Data nerding in public health
Francine Bennett (Mastodon C)
The NHS produces an amazing amount of detailed raw data about health, prescribing, doctors, hospitals, and so on. The data's a great resource for data scientists to experiment with and learn on - it's very rich, interesting, and important to society. This session will discuss the available datasets and work through some example analyses of the data from different perspectives.
10:25-10:50 (25m)
Demonstrating The Actual Economic Value of Data
Tim Kelsey (National Health Service)
Keynote by Tim Kelsey, National Director for Patients and Information, National Health Service.
18:00-19:00 (1h)
Attendee Reception/Startup Showcase
Eileen Burbidge (Passion Capital), Alexandra Deschamps Sonsino (Good Night Lamp), JP Rangaswami (Salesforce)
Grab a drink, mingle with fellow attendees, and see the latest in big data technologies and products from leading companies at the Attendee Reception - happening Monday evening immediately following afternoon sessions. We'll also continue hosting Startup Showcase during the reception.
10:50-11:20 (30m)
Break: Morning Break - Sponsored by MammothDB
12:00-13:15 (1h 15m)
Break: Lunch - Sponsored by Teradata
14:45-15:35 (50m)
Afternoon Break / Startup Showcase
Startup Showcase will kick off during the afternoon break on Monday, and continue again during the Attendee Reception--all held at the Sponsor Pavilion.
8:00-9:00 (1h)
Break: Morning Coffee Service
