Strata in London 2013 Schedule

Below are the confirmed and scheduled talks at Strata in London 2013 (schedule subject to change).

Monday, 11/11/2013

8:00

Monday, 11/11/2013
Location: King's Suite Foyer
Morning Coffee Service (1h)

9:00

Monday, 11/11/2013
Location: King's Suite
Edd Wilder-James (Silicon Valley Data Science), Kaitlin Thaney (Mozilla Science Lab)
Average rating: ***..
(3.60, 5 ratings)
Program Chairs, Edd Dumbill and Kaitlin Thaney, open the first day of keynotes. Read more.

9:10

Monday, 11/11/2013
Location: King's Suite
Doug Cutting (Cloudera)
Average rating: ***..
(3.15, 33 ratings)
As technology further pervades enterprises, each generates more data. Once harnessed, this data can enhance business, enabling growth. A new home for data has arrived to better support this: the Enterprise Data Hub, with Apache Hadoop at its center. Doug will discuss the trends that drive this and speculate on where they lead. Read more.

9:30

Monday, 11/11/2013
Location: King's Suite
Duncan Ross (TES Global)
Average rating: ***..
(3.73, 26 ratings)
Big data has proved its worth in a number of industries, but it's not the size or the storage that is making the difference. The organisations that are delivering the most value are the ones that have realised the need to drive analytics into the heart of their decision-making process. Read more.

9:40

Monday, 11/11/2013
Location: King's Suite
Felienne Hermans (Delft University of Technology)
Average rating: ****.
(4.40, 40 ratings)
In this talk Felienne will summarize her recently completed PhD research on spreadsheet structure visualization, spreadsheet smells and clone detection, and present a sneak peek into the future of spreadsheet research at Delft University. Read more.

9:55

Monday, 11/11/2013
Location: King's Suite
Max Ogden (Independent)
Average rating: ***..
(3.65, 26 ratings)
Dat aims to bring a distributed collaboration flow to big data. Git and GitHub have done it for source code, but we don't yet have a social data solution. Read more.

10:10

Monday, 11/11/2013
Location: King's Suite Level: Intermediate
Francine Bennett (Mastodon C)
Average rating: ***..
(3.93, 27 ratings)
The NHS produces an amazing amount of detailed raw data about health, prescribing, doctors, hospitals, and so on. The data's a great resource for data scientists to experiment with and learn on - it's very rich, interesting, and important to society. This session will discuss the available datasets and work through some example analyses of the data from different perspectives. Read more.

10:25

Monday, 11/11/2013
Location: King's Suite
Tim Kelsey (National Health Service)
Average rating: ***..
(3.50, 24 ratings)
Keynote by Tim Kelsey, National Director for Patients and Information, National Health Service. Read more.

10:50

Monday, 11/11/2013
Location: Monarch Suite
Morning Break - Sponsored by MammothDB (30m)

11:20

Monday, 11/11/2013
Business & Industry, Data Science, Open Data
Location: King's Suite - Balmoral Level: Intermediate
Ian Hegerty (Facebook)
Average rating: ***..
(3.67, 9 ratings)
In January, Facebook launched Graph Search in the US, which allows users to search their social graph. Ian Hegerty will describe how the Graph Search corpus was built from Facebook's entity graph, and how big data is used to understand users' queries and provide relevant results with minimal initial user behavioral data. Read more.
Monday, 11/11/2013
Data Science, Tools & Technology
Location: King's Suite - Sandringham Level: Non-technical
Patrick Wendell (Databricks)
Average rating: ****.
(4.67, 12 ratings)
As big data analytics evolves beyond simple batch jobs, there is a need for both lower-latency processing (interactive queries and stream processing) and more complex analytics (e.g. machine learning, graph algorithms). This talk will introduce Spark and Shark, popular open source projects from Berkeley that address this need through an optimized runtime engine and in-memory computing capabilities. Read more.
Monday, 11/11/2013
Sponsored
Location: Westminster
Michael Hausenblas (Red Hat)
Average rating: ****.
(4.33, 6 ratings)
I will discuss and showcase polyglot persistence and the lambda architecture. Based on real-world examples and case studies from our customer base and the wider community of practitioners, incl. the financial, energy & media industries and the realm of public data sources (government data), we will elaborate on opportunities and challenges of these new data management and processing memes. Read more.
Monday, 11/11/2013
Data Science, Design, Open Data, Tools & Technology
Location: Palace Suite - Buckingham Room Level: Non-technical
Average rating: ***..
(3.50, 2 ratings)
We're getting better all the time. See how the Cato Institute used responsive design and D3.js to show how human development indicators improve as economic freedom spreads. Read more.
Monday, 11/11/2013
Data Science
Location: Palace Suite - Blenheim Room Level: Intermediate
Jan Overgoor (Airbnb)
Average rating: ***..
(3.00, 7 ratings)
For a two-sided marketplace like Airbnb, the search engine is the main driver of the health of the business. We developed an open-source technology stack and a set of analytical methods to optimize the search experience for our users and search conversion for our business. We’ll discuss the tools we use for data crunching, analysis and reporting, as well as our thoughts on experimental design. Read more.

12:00

Monday, 11/11/2013
Location: Monarch Suite
Lunch - Sponsored by Teradata (1h 15m)

13:15

Monday, 11/11/2013
Business & Industry, Data Science, Design, Tools & Technology
Location: King's Suite - Balmoral Level: Non-technical
Average rating: ***..
(3.31, 16 ratings)
How Stuff Spreads looks at how two recent memes spread online: Gangnam Style vs Harlem Shake. The talk dissects the memes through the lens of big data to show what made them go viral, what they have in common, how quantitative and qualitative analysis have to come together to craft insights and tell a story, and finally how to predict future memes and create a data-driven content strategy. Read more.
Monday, 11/11/2013
Tools & Technology
Location: King's Suite - Sandringham Level: Advanced
Amy Unruh (Google), Felipe Hoffa (Google), Alasdair Allan (Babilim Light Industries)
Average rating: **...
(2.19, 16 ratings)
In May 2013, the O'Reilly Data Sensing Lab collaborated with the Google Cloud Platform and Device Cloud by Etherios to deploy a network of hundreds of environmental sensors at Google I/O. Learn how the Google Cloud Platform was used to build an end-to-end, scalable, and high-throughput pipeline for data collection, processing, and analysis. Read more.
Monday, 11/11/2013
Sponsored
Location: Westminster
Steven Totman (Syncsort), Matt Brandwein (Cloudera)
Average rating: **...
(2.20, 5 ratings)
Mainframe is Big Data too! Leveraging it in Hadoop creates a remarkable competitive advantage, but exploiting it without the right tools is nearly impossible, requiring you to wrestle with thousands of lines of Java, Pig, Hive, COBOL and more. This session presents a smarter way to ingest and process mainframe data in Hadoop, and how to bridge the technical, skill and cost gaps between the two. Read more.
Monday, 11/11/2013
Data Science
Location: Palace Suite - Buckingham Room Level: Intermediate
Andrew Hill (Set), Robin Kraft (World Resources Institute), Javier de la Torre (Vizzuality)
Average rating: ****.
(4.25, 8 ratings)
Maps are powerful tools for people to learn from data. In this project, we combine large-scale data processing with Hadoop and data visualization through CartoDB to make over six years of bi-monthly deforestation data accessible in an interactive map on the web. This talk will tell the story of how large-scale data paired with visualization can make data accessible in important new ways. Read more.
Monday, 11/11/2013
Design, Open Data
Location: Palace Suite - Blenheim Room Level: Non-technical
Julie Freeman (Queen Mary University of London)
Average rating: ****.
(4.00, 2 ratings)
A vending machine that dispenses crisps in response to financial news, laser sketches from 4000 years of solar eclipse data, and a mural created from QR codes acting as cellular automata. Featuring a diverse exploration of data, the ODI will showcase art commissioned and curated for its ongoing Data as Culture programme, which helps artists explore data as an art material. Read more.

14:05

Monday, 11/11/2013
Business & Industry, Data Science, Tools & Technology
Location: King's Suite - Balmoral Level: Non-technical
Average rating: *....
(1.58, 12 ratings)
How do we know what we know? Increasingly discoveries are made from computed data, possibly sourced from the internet. If we are to trust these discoveries, how conclusions are reached is critical. Examples from work in Big Data analytics infrastructure for life sciences and social media analysis will illustrate the key issues. Read more.
Monday, 11/11/2013
Open Data, Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Bruce Durling (Mastodon C)
Average rating: **...
(2.80, 10 ratings)
Many have said that 80% of data science is scrubbing data. In this talk we'll cover how you can use Cascalog to scrub, transform, manipulate and mangle data into the formats you need, fix things that are wrong and filter out things that are broken. Clojure and Cascalog together provide fantastic tools for this. Learn about using Hadoop with the messy data that exists in the real world. Read more.
Monday, 11/11/2013
Sponsored
Location: Westminster
Brian Knox (Talksum, Inc.)
Average rating: **...
(2.67, 3 ratings)
In the era of M2M communication and the Internet of Things, on top of the traditional 3 Vs of Big Data (Volume, Variety and Velocity), we need to be able to process ephemeral data that is produced by dispersed sources and must be organized and distributed to multiple services. Real-time response, security, compliance, compatibility and breaking down silos all require new approaches to data management. Read more.
Monday, 11/11/2013
Business & Industry, Data Science, Tools & Technology
Location: Palace Suite - Buckingham Room Level: Intermediate
Mano Marks (Google, Inc.), Kurt Schwehr (Google, Inc.)
Average rating: **...
(2.30, 10 ratings)
Many big data solutions focus on large data analysis that happens in data centers, or on data visualization in the browser. When you combine both of these techniques, you get amazing and expressive power. This talk will show how to use the Google Maps API with WebGL and Google BigQuery, Cloud Storage, App Engine and Compute Engine to deliver amazing, responsive visualizations. Read more.
Monday, 11/11/2013
Business & Industry
Location: Palace Suite - Blenheim Room Level: Non-technical
Simon Everest (Government Digital Service)
Average rating: ***..
(3.00, 4 ratings)
Farmers in the UK receive subsidies via the Common Agricultural Policy. These form a significant part of their income, approximately £2 billion per year in total. Changes to the data driving the payments are submitted by farmers as hand-drawn imagery. I'll share what the Government Digital Service has learned working with Defra, prototyping ways to capture changes that are sustainable, cheaper and more accurate. Read more.

14:45

Monday, 11/11/2013
Location: Monarch Suite
Average rating: ****.
(4.50, 2 ratings)
Startup Showcase will kick off during the afternoon break on Monday, and continue again during the Attendee Reception--all held at the Sponsor Pavilion. Read more.

15:35

Monday, 11/11/2013
Data Science, Design, Tools & Technology
Location: King's Suite - Balmoral Level: Intermediate
Yodit Stanton (opensensors.io)
Average rating: ***..
(3.38, 8 ratings)
Medical treatments have come a long way in the last couple of decades. On the other hand, we could be doing a lot better at using sensors to monitor people in their own homes between hospital visits. Sensors combined with Big Data technologies are set to bring about profound changes for the future of health and social care. Read more.
Monday, 11/11/2013
Data Science
Location: King's Suite - Sandringham Level: Intermediate
Arshak Navruzyan (Argyle Data)
Average rating: ***..
(3.57, 7 ratings)
The fast read and write performance and scalability of distributed in-memory clusters are making it possible to retrain machine learning algorithms in real time. The application of such algorithms to risk, infrastructure security and other areas can be transformative. Read more.
Monday, 11/11/2013
Sponsored
Location: Westminster
Christopher Hillman (Teradata)
Average rating: ***..
(3.71, 7 ratings)
In this presentation Chris will look at seven MapReduce techniques that he enjoys playing with – the bits that are fun, exciting, and that can provide valuable insight into your big data. With code examples (and pointers to where you can find them) for web, image and text processing, he'll show things that can quickly allow you to extend your analysis beyond traditional data mining. Read more.
Monday, 11/11/2013
Data Science, Open Data
Location: Palace Suite - Buckingham Room Level: Non-technical
Claire Miller (Trinity Mirror Regionals)
Average rating: **...
(2.71, 7 ratings)
How do you do data journalism when you are not the Guardian, the New York Times or the Washington Post? You don't need a data team, developers, much time or any funding to get started and produce data journalism that grabs headlines and engages readers. This workshop will focus on quick start techniques for getting started and making the most of few resources. Read more.
Monday, 11/11/2013
Business & Industry, Data Science
Location: Palace Suite - Blenheim Room Level: Non-technical
Simon Williams (QuantumBlack)
Average rating: ****.
(4.33, 9 ratings)
Crossrail will help deliver a new London. It is one of the largest civil engineering projects -- taking place literally under the feet of Strata London. We'll present how data science is being deployed at Crossrail to fundamentally change the way decisions are made and the operation is being run; from the CEO to the engineers monitoring ground movement in the tunnels. Read more.

16:25

Monday, 11/11/2013
Data Science
Location: King's Suite - Balmoral
Alasdair Allan (Babilim Light Industries)
Average rating: ****.
(4.40, 5 ratings)
Everyday objects are becoming smarter. In ten years’ time, every piece of clothing you own, every piece of jewelry, and every thing you carry with you will be measuring, weighing and calculating your life. In ten years, the world — your world — will be full of sensors. Read more.
Monday, 11/11/2013
Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Isabel Drost (Apache Software Foundation/ Nokia Gate 5 GmbH)
Average rating: *....
(1.88, 8 ratings)
"In order to classify documents, simply first convert them to vectors, train, test and finally apply the model." Sounds easy - in theory. Converting documents to vectors usually is the tricky part. This talk walks you through the steps necessary to convert your text documents into feature vectors that Mahout classifiers can use including a few anecdotes on drafting domain specific features. Read more.
Monday, 11/11/2013
Sponsored
Location: Westminster
Brett Sheppard (Splunk)
Average rating: *....
(1.83, 6 ratings)
How can business and IT users easily explore, analyse and visualise data in Hadoop? Learn about alternatives to manually writing jobs or setting up predefined schemas and how a leading enterprise used Splunk and their Hadoop distribution to empower them with new access to Hadoop data. See how they got up and running in under an hour and enabled their developers to start writing big data apps. Read more.
Monday, 11/11/2013
Open Data
Location: Palace Suite - Buckingham Room Level: Non-technical
Lilia Saul (México Infórmate), Gabriela Morales (México Infórmate)
In Mexico, open data journalism is still difficult to practice. At México Infórmate, an organization to which I belong, we developed a research project using databases; the result was a map (www.mexicoinformate.org/platform) that has managed to draw attention to the slow progress of the law enforcement system in the country. Read more.
Monday, 11/11/2013
Data Science, Ethics, Policy & Privacy
Location: Palace Suite - Blenheim Room Level: Non-technical
Stian Westlake (Nesta), Louise Marston (Nesta), Hasan Bakhshi (Nesta)
Average rating: ***..
(3.11, 9 ratings)
The economy is in a mess. But good data can help fix it. Timely analysis of large data sets is beginning to provide insight into what's really happening to business growth, employment and prosperity. We'll look at some of the most exciting examples of how Big Data is changing the way we look at the economy, and how governments and businesses can use them to their advantage. Read more.

17:15

Monday, 11/11/2013
Data Science
Location: King's Suite - Balmoral Level: Advanced
Alexander Kagoshima (Pivotal), Noelle Sio (Pivotal)
Average rating: ***..
(3.75, 12 ratings)
In the future we will see huge growth in the amount of traffic data generated through built-in car sensors. This talk presents a case study of analytics on traffic and traffic light data. Methods will be presented that yield a deep understanding of traffic and its characteristics by analyzing past traffic data. These methods could be extended to predict traffic jams and optimize routing systems. Read more.
Monday, 11/11/2013
Business & Industry
Location: King's Suite - Sandringham Level: Non-technical
Volkmar Uhlig (Adello)
Average rating: ****.
(4.23, 13 ratings)
The value of machine-generated data is highest at the moment it is generated. Operational data, sensor data and video feeds require new automated approaches to maximize the value of these streams of data. In this session, Dr. Volkmar Uhlig will explore how to apply the lessons learned from automated high-frequency trading systems to today’s Big Data problems to monetize information. Read more.
Monday, 11/11/2013
Business & Industry, Ethics, Policy & Privacy, Open Data
Location: Palace Suite - Buckingham Room Level: Non-technical
Chris Taggart (OpenCorporates)
Average rating: ***..
(3.50, 4 ratings)
Since it launched just 2 years ago, OpenCorporates has leveraged the open data community to become by far the largest open database of companies in the world, with over 50 million companies in 70 jurisdictions. It is regularly used by journalists, anti-corruption investigators, civil society, and even banks and financial institutions. Read more.
Monday, 11/11/2013
Business & Industry
Location: Palace Suite - Blenheim Room Level: Intermediate
Brett Sargent (LumaSense Technologies Inc.)
Average rating: ****.
(4.00, 1 rating)
The "smart grid" isn't just about smart meters. Every stage of our electrical power infrastructure has to be "smart," including generation, transmission and distribution. Sophisticated sensors connected to software platforms that continuously gather, visualize and analyze massive amounts of data in real time to produce actionable insights are critical to optimizing our energy assets. Read more.

18:00

Monday, 11/11/2013
Location: Sponsor Pavilion
Eileen Burbidge (Passion Capital), Alexandra Deschamps Sonsino (Good Night Lamp), JP Rangaswami (Salesforce)
Average rating: *****
(5.00, 1 rating)
Grab a drink, mingle with fellow attendees, and see the latest in big data technologies and products from leading companies at the Attendee Reception - happening Monday evening immediately following afternoon sessions. We'll also continue hosting Startup Showcase during the reception. Read more.

19:00

Monday, 11/11/2013
Location: Palace Suite - Buckingham Room
Average rating: ***..
(3.50, 2 ratings)
A special edition of the Big Data London Meetup will take place at the Strata Conference venue on the evening of day 1, promising a great crowd and amazing talks. Read more.
Monday, 11/11/2013
Location: Palace Suite - Blenheim Room
Big data & Hadoop fans can come together at this special Hadoop Users Group UK meetup at Strata - mingle with fellow Hadoopers for food, drinks, discussions, and networking. Read more.

Tuesday, 12/11/2013

8:00

Tuesday, 12/11/2013
Location: King's Suite Foyer
Morning Coffee Service (1h)

9:00

Tuesday, 12/11/2013
Location: King's Suite
Edd Wilder-James (Silicon Valley Data Science), Kaitlin Thaney (Mozilla Science Lab)
Average rating: ****.
(4.00, 1 rating)
Program Chairs, Edd Dumbill and Kaitlin Thaney, open the second day of keynotes. Read more.

9:10

Tuesday, 12/11/2013
Location: King's Suite
Average rating: ***..
(3.00, 1 rating)
Winners of the Startup Showcase are announced. Read more.

9:15

Tuesday, 12/11/2013
Location: King's Suite
Mark Madsen (Third Nature)
Average rating: ****.
(4.20, 20 ratings)
We hear stories of how big data is unprecedented and about the latest disruptive products to hit the market, products that are totally different and will change everything. Yet looking at the underlying concepts, most of these aren’t all that new and the ones that are new are being explained in the terms of the old, in the same way cars were described as “horseless carriages.” Read more.

9:35

Tuesday, 12/11/2013
Location: King's Suite
Gavin Starks (Open Data Institute)
Average rating: ***..
(3.56, 18 ratings)
Gavin Starks, CEO, Open Data Institute (ODI). Read more.

9:55

Tuesday, 12/11/2013
Location: King's Suite
Julie Steele (Silicon Valley Data Science)
Average rating: ***..
(3.83, 23 ratings)
Data science may seem like a revolutionary new field, but it is merely the latest incarnation of a tradition as old as we are: storytelling. And because it is part of such an inherently human practice, it is most valuable when it takes humanity into account. This talk explores how to use data and the techniques associated with data to build things that matter, by looking back to look forward. Read more.

10:15

Tuesday, 12/11/2013
Location: King's Suite
Average rating: ****.
(4.75, 36 ratings)
Keynote by James Burke, science and technology historian, futurist, and author. Read more.

10:50

Tuesday, 12/11/2013
Location: Monarch Suite
Morning Break (30m)

11:20

Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Sean Owen (Cloudera)
Average rating: ***..
(3.17, 6 ratings)
To keep analyzing more data, and faster, we need a secret weapon: cheating. In this brief survey, learn how you may be doing too much work in your analytics and learning processes, and how giving up a little accuracy can gain a lot of performance. With examples from Apache Hadoop, Mahout, and ML tools from Cloudera. Read more.
Tuesday, 12/11/2013
Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Rajappa Iyer (LinkedIn)
Average rating: ****.
(4.67, 3 ratings)
To feed LinkedIn's data-driven products, we need to run a complex graph of ETL workflows that deliver the right data to the right systems reliably on a 24x7 basis. To achieve this goal, we have developed a metadata system that captures process dependencies, data dependencies, and execution histories -- this system also lays the foundation for a combined dataflow and workflow engine. Read more.
Tuesday, 12/11/2013
Sponsored
Location: Westminster
Marco Bressan (BBVA), Carme Artigas (Synergic Partners)
Average rating: ****.
(4.80, 5 ratings)
The large-scale deployment of a big data strategy in the retail financial services sector poses specific challenges in terms of infrastructure, data limitations, organizational structure and portfolio definition and execution. We will share how we are addressing these challenges as well as selected demos and solutions, with focus on promising new financial product lines enabled by big data. Read more.
Tuesday, 12/11/2013
Design
Location: Palace Suite - Buckingham Room Level: Non-technical
Shelley Evenson (Fjord)
Average rating: ***..
(3.00, 1 rating)
This session will cover the rapidly changing way that machines are taking on different parts of our lives, making decisions for us and altering our lives with our own data. Shelley Evenson will address how designers need to keep their human focus in order to truly capitalise on the benefits of big data without allowing technology to take over. Read more.
Tuesday, 12/11/2013
Business & Industry
Location: Palace Suite - Blenheim Room Level: Non-technical
Alistair Croll (Solve For Interesting)
Average rating: ****.
(4.67, 15 ratings)
The Lean Startup model showed a generation of founders how to launch companies smarter and faster. At the core of this model is a constant cycle of building, measuring, and learning. In this session, we'll look at the "measure" part of this cycle, and how organizations of all sizes can use data to build a better business faster. Read more.

12:00

Tuesday, 12/11/2013
Location: Monarch Suite
Lunch and Birds of a Feather (BoF) Discussions Read more.

13:15

Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Noel Welsh (Underscore Consulting)
Average rating: ***..
(3.83, 6 ratings)
Analytics is useless if it doesn't lead to action. It is often desirable to put a computer in control of decision making. In this talk I'll discuss bandit algorithms, a class of decision making algorithms that solve a simple but widely applicable decision problem, and have found application in ad serving, content recommendation, and more. Read more.
Tuesday, 12/11/2013
Data Science
Location: King's Suite - Sandringham
Hitesh Shah (Hortonworks), Siddharth Seth (Hortonworks)
Average rating: ***..
(3.40, 5 ratings)
Apache Hadoop has become popular from its specialization in the execution of MapReduce programs. However, it has been hard to leverage existing Hadoop infrastructure for various other processing paradigms such as real-time streaming, graph processing and message-passing. That was true until the introduction of Apache Hadoop YARN in Apache Hadoop 2.0. Read more.
Tuesday, 12/11/2013
Sponsored
Location: Westminster
Marcel Kornacker (Cloudera), Robin Stephenson (Mendeley Ltd)
Average rating: ****.
(4.00, 4 ratings)
Attendees will leave this session with a deeper understanding of how organizations are using Hadoop to solve real business problems today, and how recent advancements in the Hadoop ecosystem are expanding the platform's capabilities to serve larger enterprise requirements for a virtual EDW. Read more.
Tuesday, 12/11/2013
Data Science, Ethics, Policy & Privacy
Location: Palace Suite - Buckingham Room Level: Non-technical
Francine Bennett (Mastodon C), Duncan Ross (TES Global)
Average rating: ****.
(4.73, 11 ratings)
Being good is hard. Being evil is much more fun and gets you paid a lot more. We give a survey of the field of doing high-impact evil with data and analysis. We will look at some of the simplest things you can do to make the maximum (negative) impact on your friends, your business and the world. If you happen to learn something about doing good with data that will be your problem. Read more.
Tuesday, 12/11/2013
Business & Industry
Location: Palace Suite - Blenheim Room Level: Non-technical
Stephen Simpson (Independent)
Average rating: ***..
(3.40, 5 ratings)
Making data work requires that organisations define success for their company, provide clear business goals, & articulate the right business questions. The best approach to overcoming the cognitive pitfalls that lead to failing to ask the right question comes from the intelligence services. This seminar outlines what they do, and suggests how to use it effectively inside a typical business. Read more.

14:05

Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Ulrich Rueckert (Datameer)
Average rating: ****.
(4.00, 5 ratings)
Even if one has big data, sometimes there is a lack of key data. This is a problem for predictive analytics: if there is only a limited amount of training material (e.g. user ratings, categorized documents), then it is hard to generate accurate models. The talk introduces new semi-supervised learning methods to overcome this problem by utilizing the vast amount of unlabeled data. Read more.
Tuesday, 12/11/2013
Data Science, Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Tomer Shiran (Dremio)
Average rating: ***..
(3.78, 9 ratings)
Predictive Analytics has emerged as one of the primary use cases for Hadoop, leveraging various Machine Learning techniques to increase revenue or reduce costs. In this talk we provide real-world use cases from several different industries, and then discuss the open source technologies available to companies wishing to implement Predictive Analytics with Hadoop. Read more.
Tuesday, 12/11/2013
Sponsored
Location: Westminster
Andy Cotgreave (Tableau)
How can companies use social and business data together to gain insight? See how Tableau's native Google BigQuery connector links seamlessly to live data in BigQuery and creates interactive visualizations without writing a single line of code. Find out how to share your results on the web and mobile in minutes. Read more.
Tuesday, 12/11/2013
Data Science, Ethics, Policy & Privacy
Location: Palace Suite - Buckingham Room Level: Intermediate
Aurélie Pols (Mind Your Privacy)
Average rating: ***..
(3.00, 1 rating)
Analytics best practices and data feeds and flows between tools and continents are set alongside legislation, showing which steps to take for legal compliance and how to train for data protection & ensure minimal liability. This is not about security and goes beyond the cookie debate, highlighting how the EU Personal Data Protection Regulation will influence analytics & how Privacy by Design helps. Read more.
Tuesday, 12/11/2013
Business & Industry
Location: Palace Suite - Blenheim Room Level: Non-technical
Duncan Bloor (BBC)
Average rating: ***..
(3.83, 6 ratings)
Some big organisations love the idea of using data to inform decision making but find the reality a little daunting to say the least. How are we demystifying data in the BBC and overcoming editorial fears about it lessening the view of the trusted human in making content decisions? Read more.

14:45

Tuesday, 12/11/2013
Location: Monarch Suite
Afternoon Break (50m)

15:35

Add to your personal schedule
Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Jurgen Van Gael (Rangespan, Ltd)
Average rating: ****.
(4.46, 13 ratings)
As data scientists, we find uncertainty all around us: data is noisy, missing, wrong or inherently uncertain. In this talk I want to introduce a branch of statistics called Bayesian reasoning, which is a unifying, consistent, logical and practically successful way of handling uncertainty. In short, I'd like to convince people that Bayes' rule is the E=mc^2 of data science. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Alan Gates (Hortonworks)
Average rating: ****.
(4.40, 5 ratings)
People want more out of Hive. They want it to be fast, useful, and connected to their tools. Work is being done to reduce startup time, improve the optimizer, extend it to use Tez, process records 50x faster, add support for functions like RANK, add subqueries, and add standard SQL datatypes. We will review this work plus show current benchmarks. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Business & Industry, Data Science, Tools & Technology
Location: Palace Suite - Buckingham Room Level: Intermediate
Francois Mercier (mgrafit)
Average rating: **...
(2.75, 4 ratings)
To take the right decision, you need the right data. As complexity and abundance of data increase, the communication of data analysis results becomes more challenging. Grounding our talk in the pharma R&D arena, we illustrate how animated and interactive graphics can streamline communication on complex data analysis and inform decision making. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Business & Industry, Tools & Technology
Location: Palace Suite - Blenheim Room Level: Intermediate
Pascal Clarysse (TomTom)
Average rating: ****.
(4.00, 2 ratings)
Learn how Hadoop is helping TomTom make fresher maps by continuously processing incoming GPS data, and how HBase is used to present that data to an operator. Read more.

16:25

Add to your personal schedule
Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Stefan Franczuk (Cognizant)
Average rating: **...
(2.14, 7 ratings)
How do you identify duplicate data and why is it important? What do you do with such data when you find it? Data matching using the mathematics of probability has been around since the 1950s. But how does it actually work? What is the mathematics behind it? How do probabilities allow us to identify duplicate entries? Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Neil Ferguson (NICE Systems)
Average rating: ***..
(3.33, 6 ratings)
NICE Systems is a leading provider of Customer Experience Management software, providing real-time offer management and predictive analytics applications based on HBase. We have recently migrated to HBase from our own custom-built data store, and in this session we will share the challenges we overcame in getting HBase to meet our demanding performance requirements. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Data Science, Design, Open Data
Location: Palace Suite - Buckingham Room Level: Intermediate
James Stewart (jystewart.net), James Abley (Government Digital Service)
Average rating: ****.
(4.67, 3 ratings)
The UK Government team behind the GOV.UK website talk about their work on the Performance Platform, a suite of services and a cultural shift taking people away from immensely detailed value stream maps about a call-centre and paper process (which might be an inherently 5-day long journey), to something that's digital, lightweight, fast and pleasant to use. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Business & Industry, Data Science, Tools & Technology
Location: Palace Suite - Blenheim Room Level: Non-technical
Sheldon Monteiro (SapientNitro), John Cain (SapientNitro), Thomas John Mcleish (SapientNitro)
Average rating: ***..
(3.00, 2 ratings)
78% of consumers use their smartphone while shopping in-store. What are they doing? More importantly, why? For all the media buzz around showrooming – look in-store, buy online - there is little insight on the issue. SapientNitro explains how key business questions drove hypotheses, data collection using novel instruments, and insights from analytic tools for testing and interpretive analysis. Read more.

17:15

Add to your personal schedule
Tuesday, 12/11/2013
Data Science
Location: King's Suite - Balmoral Level: Intermediate
Adam Kocoloski (Cloudant)
Average rating: ***..
(3.75, 4 ratings)
This talk will discuss how particle physics research can inform the field of data science. The importance of blind analyses and machine learning algorithms will be discussed as tools for filtering growing bodies of data as the big data trend continues. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Data Science, Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Paul Lam (uSwitch)
Average rating: **...
(2.71, 7 ratings)
What questions would you ask if you had a Facebook-like graph of what your customers like, what they bought, and what they viewed? This is what we built at uSwitch by transforming flat data from Hadoop into Neo4j. This talk will walk through how we bridged big data and linked data technologies, and the results of combining them. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Data Science
Location: Palace Suite - Buckingham Room Level: Intermediate
Piet Daas (Statistics Netherlands), Edwin De Jonge (Statistics Netherlands)
Average rating: ***..
(3.00, 1 rating)
Big Data are very interesting for official statistics. To illustrate this, results obtained by analyzing large amounts of Dutch traffic loop detection records, mobile phone data and Dutch social media messages are discussed. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Data Science
Location: Palace Suite - Blenheim Room
Roger Magoulas (O'Reilly Media)
Average rating: **...
(2.00, 2 ratings)
How quantitative data analysis and qualitative social science work can complement each other, providing deeper understanding of behavior and opening new doors of enquiry. Read more.

17:55

Tuesday, 12/11/2013
Location: King's Suite - Balmoral
TBC

18:30

Add to your personal schedule
Tuesday, 12/11/2013
Location: King's Suite
Nicola Hughes (ThoughtWorks), Duncan Ross (TES Global)
Average rating: ****.
(4.25, 4 ratings)
If you had five minutes on stage what would you say? What if you only got 20 slides and they rotated automatically after 15 seconds? We'll find out again this year, on the second day of Strata in London and the day before Velocity Europe, for one big, combined, rip-roaring Ignite event. Read more.
Add to your personal schedule
Tuesday, 12/11/2013
Location: Palace Suite - Buckingham Room
Data Science London will host their meetup at Strata Conference London on 12 November. Read more.

Wednesday, 13/11/2013

8:00

Wednesday, 13/11/2013
Location: King's Suite
Morning Coffee Service (1h)

9:00

Add to your personal schedule
Wednesday, 13/11/2013
Open Data, Tools & Technology
Location: King's Suite - Balmoral Level: Intermediate
Tom White (Cloudera)
Average rating: ***..
(3.00, 9 ratings)
In this tutorial we'll use the Cloudera Development Kit (CDK) to build a Java web app that logs application events to Hadoop, and then run ad hoc and scheduled queries against the collected data. Read more.
Add to your personal schedule
Wednesday, 13/11/2013
Data Science, Tools & Technology
Location: King's Suite - Sandringham Level: Intermediate
Markus Schmidberger (comSysto GmbH)
Average rating: ****.
(4.40, 5 ratings)
The tutorial will give a first introduction to running Big Data analyses in the statistical software R. R brings together the latest Big Data technologies and the latest high-level statistical methods. Bring your laptop, use your web browser to access an RStudio-based analysis platform in the cloud, and leave with a lot of new ideas for efficient Big Data analyses with R. Read more.

12:30

Wednesday, 13/11/2013
Location: Westminster - Park - Thames
Lunch (1h)

13:30

Add to your personal schedule
Wednesday, 13/11/2013
Tools & Technology
Location: King's Suite - Balmoral Level: Intermediate
Mischa Tuffield (PeerIndex), Davide Palmisano (PeerIndex Ltd.), Enno Shioji (PeerIndex)
Average rating: ***..
(3.50, 6 ratings)
This tutorial will describe how to process real-time streams using the open-source Storm framework. We will define Storm's core concepts whilst focusing on creating a simple topology that counts, in real time, keywords and hashtags seen in Twitter's public (1%) feed. Read more.
Add to your personal schedule
Wednesday, 13/11/2013
Design
Location: King's Suite - Sandringham Level: Non-technical
Average rating: ****.
(4.80, 5 ratings)
Communicating Data Clearly describes how to draw clear, concise, accurate graphs that are easier to understand than many of the graphs one sees today. The tutorial emphasizes how to avoid common mistakes that produce confusing or even misleading graphs. Graphs for one, two, three, and many variables are covered as well as general principles for creating effective graphs. Read more.

17:00

Wednesday, 13/11/2013
Location: Park Suite
TBC

17:30

Add to your personal schedule
Wednesday, 13/11/2013
Location: Park Suite
If you’re a woman in big data, webops, devops, and/or web performance, please come to this informal meetup on Wednesday evening for snacks, drinks, and networking with other women (and men) in the London tech community. This meetup is both transgender- and guy-friendly—everyone is welcome to attend! Read more.

Sponsors

Sponsorship Opportunities

For exhibition and sponsorship opportunities, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, email mediapartners@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

Contact Us

View a complete list of Strata contacts