Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Machine learning for preventive maintenance of mining haul trucks

Alex Gorbachev (Pythian), Paul Spiegelhalter (Pythian)
4:40pm-5:20pm Thursday, March 28, 2019
Average rating: 4.67 (3 ratings)

Who is this presentation for?

  • Data scientists, machine learning engineers, and equipment maintenance analysts



Prerequisite knowledge

  • A basic understanding of data science and machine learning

What you'll learn

  • Identify the correct ML approach for preventive maintenance
  • Explore labeling and algorithm strategies when faced with high dimensional inputs and few target samples
  • Understand how to do feature engineering from IoT data


Building machine learning solutions means a lot more than just training models from a neatly organized dataset magically provided to you. Creating preventive maintenance AI is no exception.

Applying supervised learning to preventive maintenance of haul trucks involves high-dimensional inputs, complex reasons for truck failures, and very few truck failure samples. A large volume of data with a low number of target instances creates the danger of poor model generalization, either because the model is too simple to capture the failure patterns or because it overfits the few failures it has seen. Using this example, Alex Gorbachev and Paul Spiegelhalter explain how to map preventive maintenance needs to supervised machine learning problems, create labeled datasets, do feature engineering from sensor and alert data, and evaluate models, then convert it all into a complete AI solution on Google Cloud Platform that’s integrated with existing on-premises systems. Join in to learn strategies for feature engineering when you have complex, high-volume inputs, complex situations to model, and very few labeled targets to work with.

Topics include:

  • Expressing needs to predict failures and recommend maintenance as supervised machine learning problems
  • Typical data sources needed for success
  • Turning historical data into training and testing examples
  • Practical labeling strategies
  • Feature engineering approaches for a large number of input sensors and alarms
  • Disciplined performance evaluation
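
As a rough illustration of two of the topics above (turning historical data into labeled examples, and feature engineering from sensor readings), here is a minimal Python sketch. It is not the speakers' method: the function names, the 12-hour prediction horizon, the 6-hour lookback, and the synthetic temperature readings are all assumptions made for the example. Each observation window is labeled 1 if a failure occurs within the horizon after the window ends, and one sensor's readings are summarized over a lookback period.

```python
from datetime import datetime, timedelta
from statistics import mean


def label_windows(window_ends, failure_times, horizon):
    """Label a window 1 if any failure falls in (end, end + horizon], else 0."""
    return [
        1 if any(end < f <= end + horizon for f in failure_times) else 0
        for end in window_ends
    ]


def rolling_features(readings, window_end, lookback):
    """Summarize (timestamp, value) readings in (window_end - lookback, window_end]."""
    values = [v for t, v in readings if window_end - lookback < t <= window_end]
    if not values:
        return {"mean": None, "max": None, "count": 0}
    return {"mean": mean(values), "max": max(values), "count": len(values)}


# Hypothetical data: hourly readings for 48 hours and one failure at hour 40.
start = datetime(2019, 1, 1)
readings = [(start + timedelta(hours=h), 80.0 + h) for h in range(48)]
failure_times = [start + timedelta(hours=40)]
window_ends = [start + timedelta(hours=h) for h in (12, 24, 36)]

labels = label_windows(window_ends, failure_times, horizon=timedelta(hours=12))
features = [
    rolling_features(readings, end, lookback=timedelta(hours=6))
    for end in window_ends
]

for end, feats, label in zip(window_ends, features, labels):
    print(end.isoformat(), feats, label)
```

Only the last window is labeled positive here, which mirrors the situation the abstract describes: many feature rows and very few positive targets. A longer horizon yields more positive labels at the cost of less precise predictions, so the horizon is a design choice to validate against maintenance needs.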

Alex Gorbachev


Alex Gorbachev is the head of enterprise data science at Pythian. His mission is to help clients around the world build applied AI solutions and democratize data science. Over the course of his 12 years at Pythian, Alex has held many roles, including chief technology officer and chief digital officer. His deep technological roots and industry vision have helped Pythian get to the forefront of the emerging cloud and data markets. Alex is a highly sought-after speaker at industry conferences and user groups around the world. His past accomplishments include achieving the prestigious Oracle ACE Director designation from Oracle and being named “Big Data Champion” by Cloudera.


Paul Spiegelhalter


Paul Spiegelhalter is a data scientist and deep learning specialist at Pythian, where he’s recognized for his expertise in using cutting-edge advances in artificial intelligence and machine learning to transform groundbreaking research into usable algorithms. Paul’s experience with predictive analytics and algorithmic modeling spans a number of domains, including computer vision, predictive maintenance, online advertising and user analysis, medical diagnostics, natural language processing, and anomaly detection. He holds a PhD in mathematics from the University of Illinois at Urbana-Champaign.