Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Schedule: IoT & real-time sessions

13:30–17:00 Wednesday, 1/06/2016
Location: Capital Suite 14 Level: Intermediate
Patrick McFadin (DataStax)
Average rating: ****.
(4.09, 11 ratings)
We as an industry are collecting more data every year. IoT, web, and mobile applications send torrents of bits to our data centers that have to be processed and stored, even as users expect an always-on experience, leaving little room for error. Patrick McFadin explores how successful companies do this every day using the powerful Team Apache: Apache Kafka, Spark, and Cassandra.
11:15–11:55 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Intermediate
Charles Givre (Deutsche Bank)
Average rating: ***..
(3.71, 7 ratings)
In the last few years, auto makers and others have introduced devices to connect cars to the Internet and gather data about the vehicles’ activity, and auto insurers and local governments are just starting to require these devices. Charles Givre gives an overview of the security risks as well as the potential privacy invasions associated with this unique type of data collection.
12:05–12:45 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Intermediate
Gwen Shapira (Confluent), Jeff Holoman (Cloudera)
Average rating: ****.
(4.33, 3 ratings)
Kafka provides the low latency, high throughput, high availability, and scale that financial services firms require. But can it also provide complete reliability? Gwen Shapira and Jeff Holoman explain how developers and operations teams can work together to build a bulletproof data pipeline with Kafka and pinpoint all the places where data can be lost if you're not careful.
14:05–14:45 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Intermediate
Eric Kramer (Dataiku)
Average rating: ****.
(4.00, 1 rating)
Dataiku and Bioserenity have built a system for an at-home, real-time EEG and, in the process, created an open source stack for handling the data from connected devices. Eric Kramer offers an overview of the tools Dataiku and Bioserenity use to handle large amounts of time series data and explains how they created a real-time web app that processes petabytes of data generated by connected devices.
14:55–15:35 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Intermediate
Gopal GopalKrishnan (OSIsoft, LLC.), Hoa Tram (OSIsoft)
Average rating: **...
(2.25, 8 ratings)
For decades, industrial manufacturing has dealt with large volumes of sensor data and handled a variety of data from the various manufacturing operations management (MOM) systems in production, quality, maintenance, and inventory. Gopal GopalKrishnan and Hoa Tram offer lessons learned from applying big data ecosystem tools to oil and gas, energy, utilities, metals, and mining use cases.
16:35–17:15 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Advanced
Simon Elliston Ball (Hortonworks)
Average rating: ****.
(4.38, 8 ratings)
Apache NiFi has seen it all. (It worked for the NSA after all.) What it brings to the Hadoop ecosystem is a series of data flow and ingest patterns, a GUI, and a lot of security and record-level data provenance. Simon Elliston Ball offers an overview of Apache NiFi and explores its innovations around content and provenance repositories.
16:35–17:15 Thursday, 2/06/2016
Location: Capital Suite 14 Level: Intermediate
Tags: real-time, iot
Slava Chernyak (Google)
Average rating: ****.
(4.44, 9 ratings)
Watermarks are a system for measuring progress and completeness in out-of-order stream processing systems and are used to emit correct results in a timely way. Given the trend toward out-of-order processing in current streaming systems, understanding watermarks is an increasingly important skill. Slava Chernyak explains watermarks and demonstrates how to apply them using real-world cases.
17:25–18:05 Thursday, 2/06/2016
Location: Capital Suite 7
Tags: iot
Moty Fania (Intel)
Moty Fania shares Intel’s IT experience implementing an on-premises IoT platform for internal use cases. The platform was based on open source big data technologies and containers and was designed as a multitenant platform with built-in analytical capabilities. Moty highlights the key lessons learned from this journey and offers a thorough review of the platform’s architecture.
17:25–18:05 Thursday, 2/06/2016
Location: Capital Suite 12 Level: Intermediate
Gwen Shapira (Confluent), Todd Palino (LinkedIn)
Average rating: ****.
(4.67, 9 ratings)
Apache Kafka lies at the heart of the largest data pipelines, handling trillions of messages and petabytes of data every day. Gwen Shapira and Todd Palino explain the right approach for getting the most out of Kafka, exploring how to monitor, optimize, and troubleshoot performance of your data pipelines from producer to consumer and from development to production.
17:25–18:05 Thursday, 2/06/2016
Location: Capital Suite 14 Level: Intermediate
Jim Scott (NVIDIA)
Average rating: ***..
(3.43, 7 ratings)
Application messaging isn’t new. Solutions like message queues have been around for a long time, but newer solutions like Kafka have emerged as high-performance, high-scalability alternatives that integrate well with Hadoop. Should distributed messaging systems like Kafka be considered replacements for legacy technologies? Jim Scott answers that question by delving into architectural trade-offs.
11:15–11:55 Friday, 3/06/2016
Location: Capital Suite 12 Level: Advanced
Tags: real-time, iot
Flavio Junqueira (Dell EMC)
Average rating: ****.
(4.00, 6 ratings)
Exactly-once semantics is a highly desirable property for streaming analytics. Ideally, all applications process events once and never twice, but making such guarantees in general either induces significant overhead or introduces other inconveniences, such as stalling. Flavio Junqueira explores what's possible and reasonable for streaming analytics to achieve when targeting exactly-once semantics.
14:55–15:35 Friday, 3/06/2016
Location: Capital Suite 2/3 Level: Intermediate
Tags: real-time, iot
Emil Andreas Siemes (Hortonworks), Stephan Anné (Hortonworks)
Average rating: **...
(2.00, 7 ratings)
The Internet of Things and big data analytics are currently two of the hottest topics in IT. But how do you get started using them? Emil Andreas Siemes and Stephan Anné demonstrate how to use Apache NiFi to ingest, transform, and route sensor data into Hadoop and how to do further predictive analytics.
14:55–15:35 Friday, 3/06/2016
Location: Capital Suite 12 Level: Intermediate
Tags: real-time, iot
Stephan Ewen (data Artisans), Kostas Tzoumas (data Artisans)
Average rating: ****.
(4.67, 3 ratings)
Data stream processing is emerging as a new paradigm for the data infrastructure. Streaming promises to unify and simplify many existing applications while simultaneously enabling new applications on both real-time and historical data. Stephan Ewen and Kostas Tzoumas introduce the data streaming paradigm and show how to build a set of simple but representative applications using Apache Flink.
14:55–15:35 Friday, 3/06/2016
Location: Capital Suite 13 Level: Intermediate
Karthik Ramasamy (Twitter)
Average rating: ***..
(3.00, 3 ratings)
Heron has been in production at Twitter for nearly two years and is widely used by several teams for diverse use cases. Karthik Ramasamy describes Heron in detail, covering a few use cases in depth and sharing the operating experiences and challenges of running Heron at scale.
16:35–17:15 Friday, 3/06/2016
Location: Capital Suite 10/11 Level: Intermediate
Alasdair Allan (Babilim Light Industries)
Average rating: ****.
(4.00, 2 ratings)
Privacy is no longer "a social norm," but this may not survive as the Internet of Things grows. Big data is all very well when it is harvested in the background. But it's a very different matter altogether when your things tattle on you behind your back. Alasdair Allan explains how the rush to connect devices to the Internet has led to sloppy privacy and security and why that can't continue.
16:35–17:15 Friday, 3/06/2016
Location: Capital Suite 12 Level: Intermediate
Ignacio Manuel Mulas Viela (Ericsson), Nicolas Seyvet (Ericsson AB)
Average rating: ****.
(4.00, 1 rating)
ICT systems are growing in size and complexity. Monitoring and orchestration mechanisms need to evolve and provide richer capabilities to help handle them. Ignacio Manuel Mulas Viela and Nicolas Seyvet analyze a stream of telemetry/logs in real time by following the Kappa architecture paradigm, using machine-learning algorithms to spot unexpected behaviors from an in-production cloud system.