Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK
 

Strata + Hadoop World in London 2015 Schedule


Tuesday, 5 May

King's Suite - Balmoral
9:00 Architectural considerations for Hadoop applications Gwen Shapira (Confluent), Mark Grover (Lyft), Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
13:30 Building an Apache Hadoop data application Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
18:00 Plenary
Room: King's Suite - Balmoral
Startup Showcase
King's Suite - Sandringham
9:00 Spark Camp Paco Nathan (O'Reilly Media), Alex Sicoe (Elsevier)
Buckingham Room - Palace Suite
Blenheim Room - Palace Suite
9:00 Getting started with Apache Cassandra Christopher Batey (Freelance)
13:30 D3.js Tutorial - D3 and interactive visualizations for everyone! Sebastian Gutierrez (DashingD3js.com)
St. James / Regents
9:00 Introduction to machine learning with IPython and scikit-learn Olivier Grisel (Inria & scikit-learn)
13:30 Reproducible research with R and Shiny Garrett Grolemund (RStudio), Colin Gillespie (Jumping Rivers | Newcastle University)
Hilton Meeting Room 1-3
9:00 Apache Spark advanced training (Day 1) Olivier Girardot (Lateral Thoughts), Sameer Farooqui (Databricks)
Hilton Meeting Room 4-6
9:00 Cloudera essentials for Apache Hadoop Kai Voigt (Cloudera)
12:30 Break
Room: Windsor Suite / Thames Suite / Westminster Suite / Fiamma Restaurant
17:00 Dinner
Room: On Your Own
9:00-12:30 (3h 30m) Hadoop Platform
Architectural considerations for Hadoop applications
Gwen Shapira (Confluent), Mark Grover (Lyft), Ted Malaska (Blizzard Entertainment), Jonathan Seidman (Cloudera)
This tutorial will be valuable for developers, architects, or project leads who are already knowledgeable about Hadoop and are now looking for more insight into how it can be leveraged to implement real-world applications.
13:30-17:00 (3h 30m) Hadoop Platform
Building an Apache Hadoop data application
Tom White (Cloudera), Joey Echeverria (Rocana), Ryan Blue (Cloudera)
In the second (afternoon) half of the Architecture Day tutorial, attendees will apply the best practices they learned in the morning session to build a data application for sessionizing user data.
18:00-19:30 (1h 30m) Events
Startup Showcase
What new companies are at the leading edge of the data space? Meet some of the best, most innovative founders as they demonstrate their game-changing ideas at the Startup Showcase.
9:00-17:00 (8h) Tools & Technology
Spark Camp
Paco Nathan (O'Reilly Media), Alex Sicoe (Elsevier)
Spark Camp, organized by the creators of the Apache Spark project at Databricks, will be a day-long, hands-on introduction to the Spark platform, including Spark Core, the Spark Shell, Spark Streaming, Spark SQL, MLlib, and more.
9:00-17:00 (8h) Business & Industry
Data-Driven Business Day
All-Day: For business strategists, marketers, product managers, and entrepreneurs, Data-Driven Business looks at how to use data to make better business decisions faster. Packed with case studies, panels, and eye-opening presentations, this fast-paced day focuses on how to solve today's thorniest business problems with Big Data. It's the missing MBA for a data-driven, always-on business world.
9:00-12:30 (3h 30m) Tools & Technology
Getting started with Apache Cassandra
Christopher Batey (Freelance)
Interested in time series use cases? Need a database that can scale with your application? Apache Cassandra has proven to be one of the best solutions for storing and retrieving time series data at high velocity and high volume. This tutorial will provide an in-depth introduction to Cassandra data modeling internals and finish with an example application.
13:30-17:00 (3h 30m) Design
D3.js Tutorial - D3 and interactive visualizations for everyone!
Sebastian Gutierrez (DashingD3js.com)
D3.js has a very steep learning curve for anyone who wants to create interactive visualizations. However, there are three main concepts that, once you get your head around them, make the climb much easier. Focusing on these three concepts, we will walk through many examples to teach the fundamental building blocks of D3.js-based interactive visualizations.
9:00-12:30 (3h 30m) Data Science
Introduction to machine learning with IPython and scikit-learn
Olivier Grisel (Inria & scikit-learn)
A three-hour, hands-on introductory workshop on predictive modeling and machine learning with open source tools from the Python community, such as scikit-learn and IPython.
13:30-17:00 (3h 30m) Data Science
Reproducible research with R and Shiny
Garrett Grolemund (RStudio), Colin Gillespie (Jumping Rivers | Newcastle University)
Learn how to combine the best ideas of reproducible research into a simple, easy-to-use workflow with R. The Packrat, R Markdown, and Shiny packages let you (a) embed your code into reports to create a reproducible record of your work, (b) rerun the code to generate a new report as data and ideas change, and (c) export your reports into multiple formats, including PDFs and interactive web apps.
9:00-17:00 (8h) Training
Apache Spark advanced training (Day 1)
Olivier Girardot (Lateral Thoughts), Sameer Farooqui (Databricks)
This three-day curriculum features advanced lectures and hands-on technical exercises for advanced Spark usage in data exploration, analysis, and building big data applications. Course materials emphasize architectural design patterns and best practices for leveraging Spark in the context of other popular, complementary frameworks for building and managing enterprise data workflows.
9:00-17:00 (8h) Training
Cloudera essentials for Apache Hadoop
Kai Voigt (Cloudera)
Cloudera University's one-day essentials course presents an overview of Apache Hadoop and how it can help decision-makers meet business goals, providing a fundamental introduction to the main components of Hadoop and its use cases in various industries. This course is a good starting point for any role or set of objectives and is part of the data analyst learning path.
12:30-13:30 (1h)
Break
17:00-18:00 (1h)
Break: Dinner