Churn prediction and prevention is a critical component of customer relationship management (CRM) for Microsoft’s cloud business. Because churn is a rare event and churn patterns can vary significantly across customers, predicting it with conventional machine-learning techniques is challenging. On the other hand, Microsoft has massive, rich customer usage and billing data, which makes it possible to apply advanced machine-learning techniques to discover the complex usage patterns that precede churn.
Feng Zhu and Val Fontama explore how Microsoft’s cloud business built a deep learning-based churn prediction model in partnership with the Deep Learning team at Microsoft Research. The model identifies which customers are at high risk of churning from Microsoft Cloud. To make full use of the customer data, the team built a hybrid deep learning architecture that combines fully connected (DNN) layers with recurrent (LSTM) layers, so that it can take both static features and dynamic time series data as inputs. Compared to the churn model currently in production, the deep learning model significantly improved prediction accuracy and demonstrated higher business impact.
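To make the idea of a hybrid architecture concrete, the following is a minimal sketch of a forward pass that merges a dense branch (static features) with an LSTM branch (time series) into a single churn score. It is not the speakers' model: the layer sizes, the random untrained weights, and every name in it (init_params, lstm_last_state, hybrid_churn_score) are assumptions made for illustration, and it uses plain NumPy rather than any particular deep learning framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(seq, W, U, b):
    """Run a single-layer LSTM over seq (T, n_features); return the final hidden state."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        z = W @ x_t + U @ h + b            # stacked pre-activations, shape (4 * hidden,)
        i, f, o, g = np.split(z, 4)        # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def init_params(n_static, n_seq_features, dense_units=8, lstm_units=4, seed=0):
    """Hypothetical, untrained weights; a real model would learn these from data."""
    rng = np.random.default_rng(seed)
    return {
        "W_s": rng.normal(0, 0.1, (dense_units, n_static)),
        "b_s": np.zeros(dense_units),
        "W": rng.normal(0, 0.1, (4 * lstm_units, n_seq_features)),
        "U": rng.normal(0, 0.1, (4 * lstm_units, lstm_units)),
        "b": np.zeros(4 * lstm_units),
        "w_out": rng.normal(0, 0.1, dense_units + lstm_units),
        "b_out": 0.0,
    }

def hybrid_churn_score(static_x, seq, p):
    """Dense branch on static features + LSTM branch on the series, merged into one score."""
    dense_out = np.maximum(0.0, p["W_s"] @ static_x + p["b_s"])   # ReLU layer
    seq_out = lstm_last_state(seq, p["W"], p["U"], p["b"])
    merged = np.concatenate([dense_out, seq_out])
    return float(sigmoid(p["w_out"] @ merged + p["b_out"]))

# Made-up inputs: 5 static features and 12 periods of 3 usage metrics.
params = init_params(n_static=5, n_seq_features=3)
rng = np.random.default_rng(1)
static_x = rng.normal(size=5)
usage_seq = rng.normal(size=(12, 3))
score = hybrid_churn_score(static_x, usage_seq, params)
print(f"churn score: {score:.3f}")
```

The design point is the concatenation step: each branch summarizes its own kind of input, and only the merged vector feeds the output layer, so static attributes and usage history both influence the final score.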
Beyond prediction accuracy, a second challenge in churn prediction is explaining the model and its predictions to end users with no machine-learning background. End users (i.e., marketing and sales teams) need to understand the model and its predictions in order to adopt it and act on at-risk customers. Because deep learning models (like random forest models) are black boxes, explaining in an interpretable way why the score for a specific customer is high or low is difficult. Feng and Val demonstrate how to use LIME, an algorithm published at KDD 2016, which explains the predictions of any classifier or regressor faithfully by approximating the model locally with an interpretable one. Using this algorithm, you can explain how different features contribute to the predicted churn score (from the deep learning model) for each individual customer.
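The local-approximation idea behind LIME can be sketched in a few lines. The example below is a simplified illustration written in plain NumPy, not the lime package and not the speakers' code: it perturbs one customer's features, weights the perturbed samples by their proximity to the original point, and fits a weighted linear model whose coefficients serve as the explanation. The black_box function, the feature names, and all parameter values are hypothetical.

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for a trained churn model; returns scores in (0, 1)."""
    z = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
    return 1.0 / (1.0 + np.exp(-z))

def explain_locally(predict, x, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate to predict() around the point x."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))   # perturbed neighbors of x
    y = predict(Z)                                             # black-box scores
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)               # closer samples count more
    A = np.hstack([np.ones((n_samples, 1)), Z - x])            # intercept + centered features
    sw = np.sqrt(w)
    # Weighted least squares via rescaling rows by the square root of the weights.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]

feature_names = ["support_tickets", "monthly_usage", "tenure"]  # hypothetical features
x = np.array([0.5, 0.2, 1.0])                                   # one customer (standardized)
intercept, weights = explain_locally(black_box, x)

for name, wt in sorted(zip(feature_names, weights), key=lambda t: -abs(t[1])):
    print(f"{name:>16}: {wt:+.3f}")
```

Each coefficient estimates how the churn score changes near this particular customer when one feature moves, which is the kind of per-customer explanation a sales or marketing user can read without knowing how the underlying model works.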
Feng Zhu is a data scientist at C+E Analytics and Insights within Microsoft, where he focuses on building end-to-end solutions for various problems in the Microsoft Cloud business using advanced machine-learning techniques. Previously, Feng was a research scientist on the Fraud Detection and Risk Management team at Amazon, where he collaborated with various business and engineering teams to provide fraud detection and mitigation solutions for the Pay with Amazon product. He holds a PhD in electrical engineering and MS degrees in electrical engineering and applied mathematics from the University of Notre Dame and a BS from Harbin Institute of Technology, China.
Valentine Fontama is a principal data scientist manager on Microsoft’s Analytics + Insights Data Science team that delivers analytics capabilities across Azure and C+E cloud services. Previously, he was a new technology consultant at Equifax in London, where he pioneered the use of data mining to improve risk assessment and marketing in the consumer credit industry; principal data scientist in the Data & Decision Sciences Group (DDSG), where he led consulting to external customers, including ThyssenKrupp and Dell; and a senior product manager for big data and predictive analytics in cloud and enterprise marketing at Microsoft, where he led product management for Azure Machine Learning, HDInsight, Parallel Data Warehouse (Microsoft’s first ever data warehouse appliance), and three releases of Fast Track Data Warehouse. He has published 11 academic papers and coauthored three books on big data: Predictive Analytics with Microsoft Azure Machine Learning: Build and Deploy Actionable Solutions in Minutes (2 editions) and Introducing Microsoft Azure HDInsight. Val holds an MBA in strategic management and marketing from the Wharton School, a PhD in neural networks, an MS in computing, and a BS in mathematics and electronics.
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.