Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

IoT & Real-time conference sessions

Data collected and generated by things—including the difficulties of storing, analyzing, and publishing such information; and the challenges of extracting understandable, meaningful insights from the resulting torrent.

Tuesday, September 29

1:30pm–5:00pm Tuesday, 09/29/2015
Location: 3D 04/09 Level: Advanced
Patrick McFadin (DataStax)
Average rating: ****.
(4.53, 15 ratings)
This tutorial is all about managing large volumes of data coming at your data center fast and continuously. If you don't have a strategy, then allow me to help. Amazing Apache Project software can make this problem a lot easier to deal with. Spend a few hours and learn about how each part works, and how they work together. Your users will thank you.

Wednesday, September 30

11:20am–12:00pm Wednesday, 09/30/2015
Location: 3D 02/11 Level: Intermediate
Gwen Shapira (Confluent), Jeff Holoman (Cloudera)
Average rating: ****.
(4.33, 21 ratings)
Kafka provides the low latency, high throughput, high availability, and scale that financial services firms require. But can it also provide complete reliability? In this session, we will go over everything that happens to a message, from producer to consumer, and pinpoint all the places where data can be lost if you are not careful.
1:15pm–1:55pm Wednesday, 09/30/2015
Location: 3D 02/11 Level: Intermediate
Charles Givre (Deutsche Bank)
Average rating: ****.
(4.40, 5 ratings)
Many people are acquiring smart devices, yet do not understand what data these devices gather about them or what can be done with that data once it is aggregated over time. The talk will demonstrate what data several popular devices (including the Nest Thermostat and a few others) gather, and show what can be learned about an individual from this data.
2:05pm–2:45pm Wednesday, 09/30/2015
Location: 3D 02/11
Karthik Ramasamy (Streamlio)
Average rating: ***..
(3.93, 14 ratings)
This talk will present the design and implementation of a new system, called Heron, that is now the de facto stream data processing engine inside Twitter. We will also share our experiences running Heron in production.
2:55pm–3:35pm Wednesday, 09/30/2015
Location: 3D 02/11 Level: Intermediate
Jim Scott (NVIDIA)
Average rating: ***..
(3.50, 6 ratings)
With the move to real-time data analytics and machine learning, streaming applications are relied upon more than ever before. Discover how to build and deploy a globally scalable streaming system. This includes producing messages in one data center and consuming them in another, as well as how to guarantee that nothing is ever lost.
4:35pm–5:15pm Wednesday, 09/30/2015
Location: 3D 02/11 Level: Intermediate
Hari Shreedharan (Cloudera), Anand Iyer (Cloudera)
Average rating: ***..
(3.17, 6 ratings)
Over the past year, Spark Streaming has emerged as the leading platform to implement IoT and similar real-time use cases. This session includes a brief introduction to Spark Streaming’s micro-batch architecture for real-time stream processing, as well as a live demo of an example use case that includes processing and alerting on time-series data (such as sensor data).
5:25pm–6:05pm Wednesday, 09/30/2015
Location: 3D 02/11
Ian Eslick (VitalLabs)
Average rating: ***..
(3.50, 2 ratings)
Capturing and integrating device-based and other health data for research is frustratingly difficult. We explain the open source technology framework we developed, working with O’Reilly Media and with support from the Robert Wood Johnson Foundation, for capturing and routing device-based health data for use by healthcare providers and for access by researchers via a trusted analytic container.

Thursday, October 1

11:20am–12:00pm Thursday, 10/01/2015
Location: 3D 02/11 Level: Non-technical
Average rating: *....
(1.00, 1 rating)
By 2020, researchers estimate there will be 100 million internet-connected devices. Processing this data in real time, whether from mobile phones or jet engines, will be the new normal. How are companies today adapting to this new real-time stream of data?
1:15pm–1:55pm Thursday, 10/01/2015
Location: 3D 02/11 Level: Non-technical
Tags: iot
Yan Zhang (Microsoft)
Average rating: ***..
(3.50, 4 ratings)
This talk introduces the landscape and challenges of predictive maintenance applications in the industry, illustrates how to formulate the problem (data labeling and feature engineering) for three machine learning models (regression, binary classification, multi-class classification), and showcases how the models can be conveniently trained and compared with different algorithms.
2:05pm–2:45pm Thursday, 10/01/2015
Location: 3D 02/11 Level: Intermediate
Ankur Gupta (Bitwise Inc.)
Average rating: ***..
(3.44, 9 ratings)
Using an open source technology stack, we implemented a solution for real-time analysis of sensor data from mining equipment. We will share the technical architecture, the tools we implemented for real-time complex event processing, why we chose Spark instead of Storm, some of the challenges faced, benchmarks achieved, and tips for easy integration.
2:55pm–3:35pm Thursday, 10/01/2015
Location: 3D 02/11 Level: Intermediate
Fangjin Yang (Imply), Gian Merlino (Imply)
Average rating: ***..
(3.75, 8 ratings)
The maturation and development of open source technologies have made it easier than ever for companies to derive insights from vast quantities of data. In this session, we will cover how to build a real-time analytics stack using Kafka, Samza, and Druid. This combination of technologies can power a robust data pipeline that supports real-time ingestion and flexible, low-latency queries.
4:35pm–5:15pm Thursday, 10/01/2015
Location: 3D 02/11 Level: Intermediate
Susanna Pirttikangas (University of Oulu)
Average rating: *****
(5.00, 2 ratings)
Oulu Smart City has a lively living lab tradition; we continuously collect data and expand our ecosystem of companies, research institutes, city officials, and citizens, and develop data-intensive services on top of the ecosystem. We present real use cases that implement big data platforms, along with the development of higher-level distributed reasoning and machine learning to exploit our data lake.