Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

How Microsoft predicts churn of cloud customers using deep learning and explains those predictions in an interpretable way

Feng Zhu (Microsoft), Valentine Fontama (Microsoft)
11:00am–11:40am Wednesday, March 15, 2017
Data science & advanced analytics
Location: 210 C/G
Level: Intermediate
Secondary topics:  Deep learning, ecommerce, Retail
Average rating: 4.71 (7 ratings)

Who is this presentation for?

  • Data scientists and data analysts with machine-learning knowledge

Prerequisite knowledge

  • An intermediate or advanced understanding of machine learning, analytics, and data science (useful but not required)

What you'll learn

  • Understand how Microsoft uses machine-learning models to identify churning customers and how it makes the scores interpretable to stakeholders with no ML background by using LIME, a novel explanation technique published at KDD 2016
  • Learn best practices for building a churn prediction model with deep learning, specifically using DNN and RNN layers to build classification models that take advantage of both static and dynamic (time series) features
  • Learn how the business value of the model can be estimated for potential stakeholders in a data-driven way


Churn prediction and prevention is a critical component of CRM for Microsoft’s cloud business. Since churn is a rare event and churn patterns may vary significantly across customers, predicting churn is a challenging task when using conventional machine-learning techniques. On the other hand, Microsoft has massive, rich customer usage and billing data, which enables it to exploit advanced machine-learning techniques to discover the complex usage patterns for churn.

Feng Zhu and Val Fontama explore how Microsoft’s cloud business built a deep learning-based churn predictive model in partnership with the Deep Learning team at Microsoft Research. The model identifies which customers are at high risk of churning from Microsoft Cloud. To make full use of the customer data, the team built a hybrid deep learning architecture that combines DNN layers with RNN (LSTM) layers, so it can take both static features and dynamic time series data as inputs. Compared to the current churn model in production, the deep learning model significantly improved prediction accuracy and demonstrated higher business impact.
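
The idea of a hybrid architecture can be illustrated with a minimal forward-pass sketch. This is not the speakers' model: the layer sizes, the feature counts, the single-layer depth, and the random untrained weights are all assumptions made for illustration, and the names `lstm_last_state` and `churn_score` are hypothetical. It only shows how a dense branch over static features and an LSTM branch over a time series could be merged into one churn score.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(seq, W, U, b):
    """Run one LSTM layer over seq of shape (T, d_in); return the final hidden state."""
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        z = x_t @ W + h @ U + b            # all four gates at once, shape (4 * hidden,)
        i, f, o, g = np.split(z, 4)        # input, forget, output gates and candidate
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                  # update the cell state
        h = o * np.tanh(c)                 # update the hidden state
    return h

def churn_score(static_x, seq_x, p):
    """Merge a dense branch (static features) with an LSTM branch (time series)."""
    dense = np.maximum(0.0, static_x @ p["W_static"] + p["b_static"])  # ReLU layer
    h = lstm_last_state(seq_x, p["W"], p["U"], p["b"])
    merged = np.concatenate([dense, h])
    return float(sigmoid(merged @ p["w_out"] + p["b_out"]))

# Assumed, illustrative sizes: 5 static features, 3 usage signals over 12 periods.
rng = np.random.default_rng(0)
n_static, n_dynamic, T, dense_units, hidden = 5, 3, 12, 4, 8
params = {
    "W_static": 0.1 * rng.normal(size=(n_static, dense_units)),
    "b_static": np.zeros(dense_units),
    "W": 0.1 * rng.normal(size=(n_dynamic, 4 * hidden)),
    "U": 0.1 * rng.normal(size=(hidden, 4 * hidden)),
    "b": np.zeros(4 * hidden),
    "w_out": 0.1 * rng.normal(size=dense_units + hidden),
    "b_out": 0.0,
}
static_x = rng.normal(size=n_static)       # e.g. account attributes (made up)
seq_x = rng.normal(size=(T, n_dynamic))    # e.g. usage per period (made up)
score = churn_score(static_x, seq_x, params)
print(f"churn score from untrained weights: {score:.3f}")
```

In practice such a model would be trained in a deep learning framework. The sketch covers only the forward computation, which is enough to show where the two kinds of input meet.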

In addition to prediction accuracy, another challenge in churn prediction is how to explain the model and predictions to end users with no machine-learning background. End users (i.e., marketing and sales teams) need to understand the model and predictions in order to adopt it and take action on customers. Since deep learning models (as well as random forest models) are black box models, explaining why the score for a specific customer is high or low in an interpretable way is a challenge. Feng and Val demonstrate how to use LIME, a new algorithm published at KDD 2016, to explain the predictions of any classifier or regressor in a faithful way by approximating it locally with an interpretable model. Using this algorithm, you can explain how different features contribute to the predicted churn score (from the DL model) for each individual customer.
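
The local-approximation idea can be sketched in a few lines. This is a simplified illustration rather than the published LIME algorithm or the `lime` package: it perturbs one customer's features, weights the perturbed samples by proximity with an exponential kernel, and fits a weighted linear model whose coefficients act as per-feature contributions. The `black_box` function is a hypothetical stand-in for a trained churn model, and the sampling scale and kernel width are assumed values.

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for a trained churn model: returns scores in (0, 1)."""
    z = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
    return 1.0 / (1.0 + np.exp(-z))

def explain_locally(predict, x, n_samples=2000, scale=0.3, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around x.

    Returns (intercept, weights), where weights[j] approximates how feature j
    moves the score near x.
    """
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))  # perturbed neighbours
    y = predict(Z)                                            # black-box scores
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)              # closer samples count more
    A = np.hstack([np.ones((n_samples, 1)), Z - x])           # intercept + centred features
    sw = np.sqrt(w)
    # Weighted least squares via row scaling by sqrt(weight).
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]

x = np.array([0.2, -0.1, 1.0])             # one made-up customer
intercept, weights = explain_locally(black_box, x)
for name, wt in zip(["feature_0", "feature_1", "feature_2"], weights):
    print(f"{name}: {wt:+.3f}")
```

A positive weight means that increasing the feature raises the score near this customer, and a negative weight means it lowers it. The published method additionally works on an interpretable representation and selects a small number of features, which this sketch omits.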

Topics include:

  • Areas of improvement for the churn model in production
  • How deep learning can be applied to the churn model
  • Using LIME to explain the DL model and prediction
  • Estimating the value of the model and business impact

Feng Zhu


Feng Zhu is a data scientist at C+E Analytics and Insights within Microsoft, where he focuses on building end-to-end solutions for various problems in the Microsoft Cloud business using advanced machine-learning techniques. Previously, Feng was a research scientist on the Fraud Detection and Risk Management team at Amazon, where he collaborated with various business and engineering teams to provide fraud detection and mitigation solutions for the Pay with Amazon product. He holds a PhD in electrical engineering and MS degrees in electrical engineering and applied mathematics from the University of Notre Dame and a BS from Harbin Institute of Technology, China.


Valentine Fontama


Valentine Fontama is a principal data scientist manager on Microsoft’s Analytics + Insights Data Science team that delivers analytics capabilities across Azure and C+E cloud services. Previously, he was a new technology consultant at Equifax in London, where he pioneered the use of data mining to improve risk assessment and marketing in the consumer credit industry; principal data scientist in the Data & Decision Sciences Group (DDSG), where he led consulting to external customers, including ThyssenKrupp and Dell; and a senior product manager for big data and predictive analytics in cloud and enterprise marketing at Microsoft, where he led product management for Azure Machine Learning, HDInsight, Parallel Data Warehouse (Microsoft’s first ever data warehouse appliance), and three releases of Fast Track Data Warehouse. He has published 11 academic papers and coauthored three books on big data: Predictive Analytics with Microsoft Azure Machine Learning: Build and Deploy Actionable Solutions in Minutes (2 editions) and Introducing Microsoft Azure HDInsight. Val holds an MBA in strategic management and marketing from the Wharton School, a PhD in neural networks, an MS in computing, and a BS in mathematics and electronics.

Comments on this page are now closed.


Feng Zhu |
03/19/2017 10:11am PDT

Thanks Alex for your interest. You can find our slides here:

Alex Rozinov |
03/17/2017 7:51am PDT

Is it possible to obtain a copy of the slides for this presentation?