Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Speeding up machine-learning applications with the LightGBM library in real-time domains

Mathew Salvaris (Microsoft), Miguel Gonzalez-Fierro (Microsoft)
11:30–12:00 Tuesday, 23 May 2017
Level: Intermediate

In some applications, training and retraining times need to be kept below five seconds to be useful. Such applications are often referred to as real time and include, but are not limited to, the IoT, sport result prediction, predictive maintenance, and health care. Algorithms that allow for fast retraining are fundamental to enabling such applications and can open up new business opportunities. One reason for retraining is that the features used in these applications can degrade, so that previously useful features stop being useful. Such degradation is often observed as sensors age or as information becomes out of date.

LightGBM is a new open source library created by Microsoft that is set to become the new standard in decision tree algorithms. Depending on the application, it can be 4 to 10 times faster than XGBoost while offering higher accuracy. It has already proven useful in several Kaggle competitions. Mathew Salvaris and Miguel González-Fierro explore this promising library, compare it with the current state of the art, and demo a business case of a real-time application.

Mathew Salvaris

Microsoft

Mathew Salvaris is a data scientist at Microsoft. Previously, Mathew was a data scientist for a small startup that provided analytics for fund managers and a postdoctoral researcher at UCL’s Institute of Cognitive Neuroscience, where he worked with Patrick Haggard in the area of volition and free will, devising models to decode human decisions in real time from the motor cortex using electroencephalography (EEG). He also held a postdoctoral position in the University of Essex’s Brain Computer Interface Group, where he worked on BCIs for computer mouse control. Mathew holds a PhD in brain computer interfaces and an MSc in distributed artificial intelligence.

Miguel Gonzalez-Fierro

Microsoft

Miguel González-Fierro is a senior data scientist at Microsoft UK, where he helps customers improve their processes using big data and machine learning. Previously, he was CEO and founder of Samsamia Technologies, a company that created a visual search engine for fashion items, allowing users to find products using images instead of words, and founder of the Robotics Society of Universidad Carlos III, which developed projects related to UAVs, mobile robots, small humanoid competitions, and 3D printers. Miguel also worked as a robotics scientist at Universidad Carlos III of Madrid and King’s College London, where his research focused on learning from demonstration, reinforcement learning, computer vision, and dynamic control of humanoid robots. He holds a BSc and MSc in electrical engineering and an MSc and PhD in robotics.

Comments

Miguel Gonzalez-Fierro | SENIOR DATA SCIENTIST
26/05/2017 10:04 BST

The slides of the talk are available here: https://www.slideshare.net/MiguelFierro1/speeding-up-machinelearning-applications-with-the-lightgbm-library
The GitHub repo with all the experiments is here: https://github.com/Azure/fast_retraining/

Michal Kucharczyk | BI & RISK MANAGEMENT SPECIALIST
26/05/2017 9:35 BST

Hello Mathew and Miguel, do you plan to share the slides?