Strata in London 2012 Schedule

Below are the confirmed and scheduled talks at Strata in London 2012 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler. Mark the sessions, keynotes, and events you want to attend by selecting the calendar icon next to each listing, then go to your personal schedule to view your customized schedule.

10:45 Morning Break
Room: Monarch Suite
12:10 Lunch
Room: Monarch Suite
Tuesday Lunchtime BoFs
14:55 Afternoon Break
Room: Monarch Suite
Buckingham Room
11:45 Open Data - a new tool for government John Sheridan (The National Archives), Jeni Tennison (Open Data Institute)
14:00 Machine as Collaborator JD Vogt (Salesforce)
14:30 clearScience: Dragging Scientific Communication into the Information Age Brian Bot (Sage Bionetworks), Erich Huang (Sage Bionetworks)
16:00 Best practices for publishing data Hjalmar Gislason (DataMarket)
16:30 Democratising data journalism with the Miso project Irene Ros (Bocoup), Alastair Dant (Guardian News and Media), Alex Graul (Guardian News & Media)
17:00 Zooniverse: Web-scale Citizen Science Arfon Smith (University of Oxford)
18:30 Plenary
Room: Buckingham Room
Data Science London Meetup (Community Event)
Blenheim Room (Sponsored)
16:30 New Opportunities for Connected Data Ian Robinson (Neo Technology)
King's Suite
9:15 Plenary
Room: King's Suite
Tuesday Welcome Kaitlin Thaney (Mozilla Science Lab), Edd Wilder-James (Silicon Valley Data Science)
9:20 Plenary
Room: King's Suite
The Great Railway Caper: Big Data in 1955 John Graham-Cumming (CloudFlare)
9:45 Plenary
Room: King's Suite
The Secret Life of Data Alasdair Allan (Babilim Light Industries)
10:00 Plenary
Room: King's Suite
Startup Showcase Winners Announced
10:05 Plenary
Room: King's Suite
The Quiet Comfort of the Internet of Things Alexandra Deschamps-Sonsino (Founder of Good Night Lamp / Designswarm Founder)
10:25 Plenary
Room: King's Suite
Where's the Missing Data? Ben Goldacre (Bad Science)
11:15 Using data as a weapon to tackle climate change Gavin Starks (Open Data Institute)
13:30 Financial Data and Journalism - How Bloomberg Makes Data Work Marianne Bouchart (Bloomberg News)
14:00 Hacking Data For Hacks Aron Pilhofer (New York Times), Mirko Lorenz (Deutsche Welle), Nicolas Kayser-Bril (Journalism++), Liliana Bounegru (European Journalism Centre)
14:30 Transparency Transformed: From Data to Insight Kristian Hammond (Narrative Science)
16:00 Knowing What To Do With Data Tim Barker (DataSift)
16:30 Data Science for Agile Strategy: From Formula 1 to the Boardroom Simon Williams (QuantumBlack), Jacomo Corbo (QuantumBlack)
17:00 Analyzing 3 Million Spreadsheets Felienne Hermans (Delft University of Technology)
17:30 Ignite sponsored by HP
Room: King's Suite
Ignite Strata + Velocity
Room 1-6
11:15 How to Fail Your Big Data Project Quick and Rapidly Isabel Drost (Apache Software Foundation/ Nokia Gate 5 GmbH), Hannes Kruppa (Nokia Maps)
13:30 Hadoop and Beyond: Real World Architectures Edouard Servan-Schreiber (10gen), Duncan Ross (TES Global)
14:00 What's New in Hadoop MapReduce 2? Tom White (Cloudera)
14:30 Data Availability and Integrity in Apache Hadoop Steve Loughran (Hortonworks)
16:00 Customer Behavior Modeling at Scale Ted Dunning (MapR Technologies)
16:30 People Watching with Machine Learning Alasdair Allan (Babilim Light Industries), Zena Wood (University of Exeter)
10:45-11:15 (30m)
Break: Morning Break
12:10-13:30 (1h 20m)
Tuesday Lunchtime BoFs
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on both days of the conference. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area.
14:55-16:00 (1h 5m)
Break: Afternoon Break
11:15-11:40 (25m) Business & Industry
The Development of Privacy law - Protecting Celebrities at the Expense of Everyone Else?
Helen Child (Helen Child Consulting)
The furore over phone hacking has again led to demands for a privacy law, whilst media scorn at super-injunctions protecting celebrities from sex scandals pulls the debate in the other direction. But the average Joe/sephine seems happy to populate their social media presence with increasing amounts of detail. Is privacy only a concern of the famous? And should they be shaping law and policy?
11:45-12:10 (25m) Business & Industry
Open Data - a new tool for government
John Sheridan (The National Archives), Jeni Tennison (Open Data Institute)
To what public sector problems is open data the solution? This talk will describe how opening data has allowed The National Archives to introduce a new operating model for revising legislation and updating the government's legislation database, bringing in private investment to improve open, public and free legislation data. It will describe the operating model and technology behind this approach.
13:30-13:55 (25m) Visualization & Interface
Subtlety and softness in data-driven art. It’s not all infographics and screen-based visualizations.
Julie Freeman (Queen Mary University of London)
In this session I’ll present recent work and research, and discuss how subjectivity can be used to create subtle and emotive, physical data-driven artwork that demonstrates patterns in data.
14:00-14:25 (25m) Visualization & Interface
Machine as Collaborator
JD Vogt (Salesforce)
Through big data technologies we can now begin to consider machines as active participants in our experiences and decisions: true collaborators. This talk will discuss the foundations of creativity and intuition, show examples of how machines are augmenting our decisions today and the roles they will play in the future, and explore how our traditional interfaces will disappear as a result.
14:30-14:55 (25m) Data Science: Career & Culture, Data That Matters
clearScience: Dragging Scientific Communication into the Information Age
Brian Bot (Sage Bionetworks), Erich Huang (Sage Bionetworks)
Data-intensive scientific communication is broken. Ironically, the components necessary for open and executable science exist in isolation. clearScience is a pilot at Sage Bionetworks to assemble these components (data, code, and compute infrastructure) into a stack that facilitates not only effective reporting of science, but delivery of the science itself.
16:00-16:25 (25m) Visualization & Interface
Best practices for publishing data
Hjalmar Gislason (DataMarket)
You want to publish your data for clients, developers or the general public to use and enjoy. But which file formats to use? Which standards? How to provide an API? Should you visualize the data? And if so, how? DataMarket has been on the receiving end of data from many of the world's key data providers and is now helping leading information companies publish theirs. Here we share our findings.
16:30-16:55 (25m) Visualization & Interface
Democratising data journalism with the Miso project
Irene Ros (Bocoup), Alastair Dant (Guardian News and Media), Alex Graul (Guardian News & Media)
Learn how the Miso Project - an open source toolkit - can help build engaging interactive content and let authors focus on telling stories with data.
17:00-17:25 (25m) Data Science
Zooniverse: Web-scale Citizen Science
Arfon Smith (University of Oxford)
Web-scale citizen science such as Zooniverse (www.zooniverse.org) has provided a solution to the flood of data that confronts researchers in the 21st century, but only a short-term one. In this presentation I will outline a potential strategy for combining a large web community and significant compute resources to create a scalable, intelligent classification engine.
18:30-21:30 (3h)
Data Science London Meetup (Community Event)
Data Science London will host their meetup at Strata Conference London on October 2nd.
11:15-11:40 (25m) Sponsored
Powering Next-Generation Data Architectures with Apache Hadoop
Shaun Connolly (Hortonworks)
In this talk Shaun Connolly, VP Corporate Strategy for Hortonworks, will look at Hadoop's opportunity and the value it can unlock. Along the way he will discuss the kind of efforts required from the community, the solution ecosystem, and the enterprise in order to solidify Hadoop's place within the enterprise.
16:30-16:55 (25m) Data Science
New Opportunities for Connected Data
Ian Robinson (Neo Technology)
Today's complex data is not only big, but also semi-structured and densely connected. In this session we'll look at how size, structure and connectedness have converged to transform the data landscape.
9:15-9:20 (5m)
Tuesday Welcome
Kaitlin Thaney (Mozilla Science Lab), Edd Wilder-James (Silicon Valley Data Science)
Program Chairs, Edd Dumbill and Kaitlin Thaney, welcome you to the second day of Strata in London keynotes.
9:20-9:45 (25m)
The Great Railway Caper: Big Data in 1955
John Graham-Cumming (CloudFlare)
It's 1951 and you've got the world's first business computer and you've just been handed a Big Data problem. Go!
9:45-10:00 (15m)
The Secret Life of Data
Alasdair Allan (Babilim Light Industries)
Big data isn't just multi-terabyte datasets hidden inside eventually-concurrent distributed databases in the cloud. It’s also about the hidden data you carry with you all the time, data that is generated for you and about you, but not necessarily by you. Hidden data, your data, carrying on its secret life without your knowledge, but with your implicit and implied consent.
10:00-10:05 (5m)
Startup Showcase Winners Announced
Winners of the Startup Showcase are announced.
10:05-10:25 (20m)
The Quiet Comfort of the Internet of Things
Alexandra Deschamps-Sonsino (Founder of Good Night Lamp / Designswarm Founder)
Alexandra Deschamps-Sonsino, Founder of Good Night Lamp / Founder of Designswarm
10:25-10:45 (20m)
Where's the Missing Data?
Ben Goldacre (Bad Science)
Data is great. Data is powerful. But when some data is missing, bias can be introduced, distorting the overall picture.
11:15-11:40 (25m) Data That Matters
Using data as a weapon to tackle climate change
Gavin Starks (Open Data Institute)
We live on a finite and bounded planet. This fact seems largely ignored in our global economic systems. AMEE has compiled millions of environmental data points. We are now combining them with large-scale financial data to create a "forcing function" that will drive mainstream environmental sustainability.
11:45-12:10 (25m) Data That Matters
How to Save the World from Big Data: Tactics for Making Cloud Computing Massively Greener
Francine Bennett (Mastodon C)
Cloud computing enables cool massive-scale data analysis, but has a very large carbon footprint, especially since many public cloud providers are powered by coal-fired grids: annual data centre emissions are currently ~75 million tonnes and growing. This talk aims to increase understanding of the issue, and to demonstrate how to achieve big carbon reductions without reducing the analysis you do.
13:30-13:55 (25m) Business & Industry
Financial Data and Journalism - How Bloomberg Makes Data Work
Marianne Bouchart (Bloomberg News)
A presentation about how the biggest financial news organisation in the world is handling big data on a daily basis through its terminal system, and the new data journalism projects it is developing at the moment to deliver ground-breaking news to the most influential people in the world.
14:00-14:25 (25m) Data That Matters
Hacking Data For Hacks
Aron Pilhofer (New York Times), Mirko Lorenz (Deutsche Welle), Nicolas Kayser-Bril (Journalism++), Liliana Bounegru (European Journalism Centre)
Data journalism blurs the line between coders, data geeks and journalists. The Data Journalism Handbook encourages journalists to treat data as a source and to pick up their computer to try new ways of reporting. This session highlights key lessons from the book, including a) getting stories from data (big or small) b) business models for data driven newsrooms and c) how to get started.
14:30-14:55 (25m) Business & Industry
Transparency Transformed: From Data to Insight
Kristian Hammond (Narrative Science)
In this talk, I will look at the next step in big data in general and open data in particular: transparency of insight and how the intelligent transformation of data into narratives can bring to light the stories within it and enable the higher level of understanding and insight needed to support evidence-based decision-making.
16:00-16:25 (25m) Business & Industry
Knowing What To Do With Data
Tim Barker (DataSift)
Data, data everywhere... from our unique experience providing social data to hundreds of customers, we have learned the biggest problems you'll encounter when you get all the data you wish for. In this session, we'll share these problems, some solutions and how we meet our own challenges in dealing with massive flows of social data.
16:30-16:55 (25m) Business & Industry
Data Science for Agile Strategy: From Formula 1 to the Boardroom
Simon Williams (QuantumBlack), Jacomo Corbo (QuantumBlack)
Strategy has changed. The step-change in data abundance, speed and competition means that static business plans striving for that 'perfect answer' are obsolete. We'll demonstrate how the data science underpinning the race strategy engines used in Formula One to plan, track and update strategy in real time is enabling Fortune 500 companies to be more agile, and creating a new way of planning strategy.
17:00-17:25 (25m) Business & Industry
Analyzing 3 Million Spreadsheets
Felienne Hermans (Delft University of Technology)
Spreadsheets are used almost everywhere, for almost everything. Researchers from Delft University of Technology have studied spreadsheet users and their spreadsheets to learn more about how exactly they are built, maintained and migrated. In this session we present a case study concerning the analysis of 3 million spreadsheets.
17:30-18:30 (1h)
Ignite Strata + Velocity
If you had five minutes on stage what would you say? What if you only got 20 slides and they rotated automatically after 15 seconds? We’ll find out again this year, following the last day of Strata in London and the first day of Velocity Europe — for one big, combined, rip-roaring Ignite event.
11:15-11:40 (25m) Data Science
How to Fail Your Big Data Project Quick and Rapidly
Isabel Drost (Apache Software Foundation/ Nokia Gate 5 GmbH), Hannes Kruppa (Nokia Maps)
Failing software projects is already easier than we'd like to admit. When dealing with big data, a topic hyped quite a bit, the chances of projects failing miserably are even higher. This talk highlights some of the most prominent anti-patterns when dealing with data analysis, scaling and data science.
11:45-12:10 (25m) Hadoop: Tools & Technology
Letting More Developers Dance with Elephants: What We Learned
Tim Mallalieu (Microsoft)
In this session we’ll discuss our experience extending Hadoop development to new platforms and languages, and key aspects of using non-JVM languages in the Hadoop environment.
13:30-13:55 (25m) Data Science, Hadoop: Tools & Technology
Hadoop and Beyond: Real World Architectures
Edouard Servan-Schreiber (10gen), Duncan Ross (TES Global)
A guide to how real world companies have architected their big data ecosystems, incorporating Hadoop, NoSQL and data warehouse technologies.
14:00-14:25 (25m) Hadoop: Tools & Technology
What's New in Hadoop MapReduce 2?
Tom White (Cloudera)
Apache Hadoop 2 has a new MapReduce engine, which is built on a new general resource management system for running distributed applications called YARN. This talk explains the architecture of YARN, and discusses what this means to users of MapReduce and related frameworks, and to developers writing new parallel processing applications.
14:30-14:55 (25m) Hadoop: Tools & Technology
Data Availability and Integrity in Apache Hadoop
Steve Loughran (Hortonworks)
Failures in the datacentre can threaten the availability and integrity of data in your Hadoop cluster unless you have strategies to reduce this risk. This talk uses real customer data to introduce the threats to data integrity and availability, and shows how to minimize the risks.
16:00-16:25 (25m) Data Science
Customer Behavior Modeling at Scale
Ted Dunning (MapR Technologies)
Nearest neighbor (k-nn for short) models are conceptually just about the simplest kind of behavioral model possible but are generally considered infeasible for production. This talk will describe the knn project and how it can reduce thousand-year computations to a few hours or make real-time use of k-nn models practical. Practical results will be shown and implementation methods described.
16:30-16:55 (25m) Data Science
People Watching with Machine Learning
Alasdair Allan (Babilim Light Industries), Zena Wood (University of Exeter)
Observing how other humans interact is so interesting that we do it recreationally; we call it "people watching". Evolution has equipped us both with a desire to people watch, and with the tools we need to do it, but it's hard to describe what it is we're doing. If we could, we could make our machines people watch for us, potentially yielding novel insights into our own social interactions.
17:00-17:25 (25m) Data Science
Cleaning Gritty Data to tell a story in the News Room or the Board Room
Thomas Levine (csv soundsystem)
Masters at web scraping and data journalism from ScraperWiki tell tales and give practical advice from years of cleaning data. What are common gotchas when fixing up data before you make it do something, and how do you get round them? Illustrated with real examples from the world of journalism and business.

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities, contact Susan Stewart at sstewart@oreilly.com or +1 (707) 827-7148.

Media Partner Opportunities

For information on trade opportunities, contact Kathy Yu at mediapartners@oreilly.com.

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com.
