Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

State-of-the-art robot predictive maintenance with real-time sensor data

Mateusz Dymczyk (H2O.ai), Mathieu Dumoulin (MapR Technologies)
11:20am–12:00pm Wednesday, September 27, 2017
Secondary topics: IoT

Who is this presentation for?

  • Data engineers, data scientists, and project managers and executives working with the IoT and Industry 4.0

What you'll learn

  • Learn how to build real-time IoT pipelines by leveraging well-known, standard enterprise big data components such as H2O, TensorFlow, MapR, Kafka, and Spark
  • Explore real-world examples of how to get additional value from an existing IoT sensor data pipeline, along with a practical example of the benefits of a streaming architecture in action

Description

Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality, and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary bespoke monitoring systems built at tremendous cost.

It is now possible to successfully deploy Industry 4.0 pilot use cases in a matter of months, using a well-chosen selection of big data enterprise products and open source projects, at a small fraction of the cost of equivalent projects at leading high-tech makers. Mateusz Dymczyk and Mathieu Dumoulin showcase a working, practical predictive maintenance pipeline in action and explain how they built a state-of-the-art anomaly detection system using big data frameworks like Spark, H2O, TensorFlow, and Kafka on the MapR Converged Data Platform.

This is an improved version of the pipeline Mateusz and Mathieu demonstrated at Strata Beijing. This pipeline uses data collected from a Bluetooth wireless movement sensor attached to a realistic model of a standard industrial robot.

Topics include:

  • How to integrate data from a second sensor type
  • Why the overall system predictions are better than models made from either data source taken separately
  • How easy it is to switch to a state-of-the-art LSTM anomaly detection model
  • A comparison with the baseline model

Mateusz Dymczyk

H2O.ai

Mateusz Dymczyk is a Tokyo-based software engineer at H2O.ai, where he works as a researcher on machine learning and NLP projects. His work covers distributed machine learning, including the core H2O platform and Sparkling Water, which integrates H2O and Apache Spark. Previously, he worked at Fujitsu Laboratories. Mateusz loves all things distributed and machine learning and hates buzzwords. In his spare time, he participates in the IT community by organizing, attending, and speaking at conferences and meetups. Mateusz holds an MSc in computer science from AGH UST in Krakow, Poland.


Mathieu Dumoulin

MapR Technologies

Mathieu Dumoulin is a data scientist in MapR Technologies’s Tokyo office, where he combines his passion for machine learning and big data with the Hadoop ecosystem. Mathieu started using Hadoop from the deep end, building a full unstructured data classification prototype for Fujitsu Canada’s Innovation Labs, a project that eventually earned him the 2013 Young Innovator award from the Natural Sciences and Engineering Research Council of Canada. Afterward, he moved to Tokyo with his family, where he worked as a search engineer at a startup and a managing data scientist for a large Japanese HR company, before coming to MapR.