Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Tutorials & Conference
Beijing, China

机器人的预测性维护实战:解读实时、可扩展的分析管道 (Robot predictive maintenance in action: Real-time, scalable pipelines explained)

This will be presented in English.

Mathieu Dumoulin (McKinsey & Company), Mateusz Dymczyk (H2O.ai)
14:50–15:30 Friday, 2017-07-14
物联网&实时计算 (IoT & real-time), 英文讲话 (Presented in English)
Location: 多功能厅6A+B (Function Room 6A+B)
观众水平 (Level): Intermediate

必要预备知识 (Prerequisite Knowledge)

A general understanding of big data technologies and machine learning

您将学到什么 (What you'll learn)

Explore a fully working pipeline from sensor to visualization explained step by step, learn how to apply anomaly detection on real-time streaming sensor data, and see a real application of modern big data streaming architecture in action

描述 (Description)



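This working system is an implementation of a predictive maintenance use case. It is only made possible by a clever use of modern, microservices-based streaming architecture. The system draws on the unique features of the MapR Converged Data Platform for operational analytics, messaging, and storage. Machine learning modeling and deployment are handled with H2O.ai.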


Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality, and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, some of the most advanced industrial manufacturers in the world are only beginning to make use of this data, relying on relatively rudimentary, bespoke monitoring systems built at tremendous cost.

It is now possible to deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech manufacturers, using a well-chosen selection of enterprise big data products and open source projects. Mathieu Dumoulin and Mateusz Dymczyk walk you step by step through building a real-time, ML-based anomaly detection system on a working industrial robot analog fitted with a wireless movement sensor. The system is only made possible by a clever use of modern, microservices-based streaming architecture. You’ll learn how to gather data from a wireless movement sensor, process it with H2O on a MapR cluster, and visualize the output for an operator through an AR headset.
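
To make the idea of anomaly detection on streaming sensor data concrete, here is a minimal sketch in plain Python. It is not the speakers' implementation: the session uses H2O models on a MapR cluster, whereas this example swaps in a simple rolling z-score as a stand-in technique. The class name `RollingZScoreDetector`, the helper `detect`, and all parameter values are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    """Flags readings that deviate strongly from a recent window of values.

    Illustrative stand-in only; the session itself describes H2O-based models.
    """

    def __init__(self, window_size=50, threshold=3.0, min_samples=10):
        self.window = deque(maxlen=window_size)  # most recent readings
        self.threshold = threshold               # z-score cutoff (assumed value)
        self.min_samples = min_samples           # warm-up before scoring starts

    def update(self, value):
        """Score one reading against the current window, then store it."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mean = sum(self.window) / len(self.window)
            variance = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = sqrt(variance)
            # A zero standard deviation means the window is constant; skip scoring.
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # The reading is stored even when flagged, so the baseline keeps adapting.
        self.window.append(value)
        return is_anomaly


def detect(readings, **kwargs):
    """Return the indices of readings flagged as anomalous."""
    detector = RollingZScoreDetector(**kwargs)
    return [i for i, value in enumerate(readings) if detector.update(value)]
```

In a streaming setting, `update` would be called once per message as readings arrive from the sensor, and flagged readings would be forwarded to the visualization layer. A learned model would replace the z-score rule where the signal is more complex than a single value drifting from its recent average.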


Mathieu Dumoulin

McKinsey & Company

Mathieu Dumoulin is a Digital Expert at McKinsey & Company’s Tokyo office, where he advises large enterprises on big data, enterprise architecture, and advanced analytics solutions.
His current interests include building production systems that optimize industrial processes using operational data and real-time IoT sensor data.

Mateusz Dymczyk

Mateusz Dymczyk is a Tokyo-based software engineer at H2O.ai, the company behind H2O, the leading open source machine learning platform for smarter applications and data products. He works on distributed machine learning projects including the core H2O platform and Sparkling Water, which integrates H2O and Apache Spark. Previously, he worked at Fujitsu Laboratories on natural language processing and the application of machine learning techniques to investments, and at Infoscience on a highly distributed log data collection and analysis platform. Mateusz loves all things distributed and machine learning and hates buzzwords. In his spare time, he participates in the IT community by organizing, attending, and speaking at conferences and meetups. Mateusz holds an MSc in computer science from AGH UST in Krakow.
