Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Schedule: Data innovations sessions

9:00–12:30 Wednesday, 1/06/2016
Location: Capital Suite 14 Level: Intermediate
Ian Wrigley (StreamSets)
Average rating: 4.52 (21 ratings)
Ian Wrigley leads a hands-on workshop on leveraging the capabilities of Apache Kafka to collect, manage, and process stream data for both big data projects and general-purpose enterprise data integration, covering key architectural concepts, developer APIs, use cases, and how to write applications that publish data to, and subscribe to data from, Kafka. No prior knowledge of Kafka is required.
13:30–17:00 Wednesday, 1/06/2016
Location: London Suite 2&3 Level: Non-technical
Tags: ai
Marc Warner (ASI)
Average rating: 4.50 (4 ratings)
In a hands-on tutorial designed for executives, product managers, and business leaders, Marc Warner explores what's possible (and not) with machine learning and what that means for businesses. Attendees will gain experience with cutting-edge artificial intelligence by building their very own handwriting recognition engine. No technical background required.
11:15–11:55 Thursday, 2/06/2016
Location: Capital Suite 14 Level: Advanced
Cliff Click (0xdata)
Average rating: 3.67 (3 ratings)
H2O is an in-memory, big-data, big-math machine-learning platform. Cliff Click offers a technical talk focused on the internals of H2O, explaining how you can write simple, single-threaded Java code and have H2O automatically parallelize it and scale it out to hundreds of nodes and thousands of cores.
11:15–11:55 Thursday, 2/06/2016
Location: Capital Suite 17 Level: Non-technical
Tags: government
Brian Hills (The Data Lab)
Average rating: 4.00 (4 ratings)
The Data Lab is an innovation center that delivers social and economic benefit to Scotland by bringing industry, the public sector, and academia together to exploit new opportunities from data. Brian Hills shares insights and lessons learned during the center's first 18 months, organized into three themes: collaborative innovation, nurturing skills and talent, and community building.
12:05–12:45 Thursday, 2/06/2016
Location: Capital Suite 14 Level: Intermediate
Sherry Moore (Google)
Average rating: 4.12 (17 ratings)
TensorFlow is an open source software library for numerical computation with a focus on machine learning. Its flexible architecture makes it great for research and production deployment. Sherry Moore offers a high-level introduction to TensorFlow and explains how to use it to train machine-learning models to make your next application smarter.
14:55–15:35 Thursday, 2/06/2016
Location: Capital Suite 14 Level: Intermediate
Julien Le Dem (WeWork)
Average rating: 4.77 (13 ratings)
In pursuit of speed and efficiency, big data processing is continuing its logical evolution toward columnar execution. Julien Le Dem offers a glimpse into the future of column-oriented data processing with Arrow and Parquet.
11:15–11:55 Friday, 3/06/2016
Location: Capital Suite 4 Level: Non-technical
Tal Guttman (Windward)
Average rating: 5.00 (1 rating)
With over 90% of the world’s trade transported over the oceans, data on ship activity is critical to decision makers across industries. But despite the huge stakes at sea, ship activity remains a mystery: the data is massive, fragmented, and extremely unreliable when taken as is. Tal Guttman explores how data science can shed light on this critically important but opaque world.
11:15–11:55 Friday, 3/06/2016
Location: Capital Suite 14 Level: Intermediate
Tags: real-time, iot
Tyler Akidau (Google)
Average rating: 4.53 (17 ratings)
Tyler Akidau offers a whirlwind tour of the conceptual building blocks of massive-scale data processing systems over the last decade, comparing and contrasting systems at Google with popular open source systems in use today.
12:05–12:45 Friday, 3/06/2016
Location: Capital Suite 12 Level: Advanced
Tags: real-time
Kenneth Knowles (Google)
Average rating: 4.67 (3 ratings)
Drawing on important real-world use cases, Kenneth Knowles delves into the details of the language- and runner-independent semantics developed for triggers in Apache Beam, demonstrating how the semantics support those use cases as well as the variability found across streaming systems. Kenneth then describes some of the particular implementations of those semantics in Google Cloud Dataflow.
12:05–12:45 Friday, 3/06/2016
Location: Capital Suite 14 Level: Intermediate
Xavier Léauté (Confluent)
Average rating: 4.50 (6 ratings)
Xavier Léauté shares his experience and the challenges of scaling Metamarkets' real-time processing to over 3 million events per second. Built entirely on open source, the stack performs streaming joins using Kafka and Samza and feeds into Druid, serving 1 million interactive queries per day.
14:05–14:45 Friday, 3/06/2016
Location: Capital Suite 12 Level: Intermediate
Tags: real-time, iot
Neha Narkhede (Confluent)
Average rating: 4.12 (8 ratings)
Neha Narkhede offers an overview of Kafka Streams, a new stream processing library natively integrated with Apache Kafka. It has a very low barrier to entry, easy operationalization, and a natural DSL for writing stream processing applications. As such, it is the most convenient yet scalable option for analyzing, transforming, or otherwise processing data that is backed by Kafka.