In this tutorial, you will create end-to-end predictive models using the extensive library of machine learning algorithms included in Microsoft Azure Machine Learning Studio, along with its R and Python language extensibility. You will then deploy and consume a model and use it to make predictions over business data. Along the way, you will walk through the typical steps of a machine learning project: data ingestion, data cleaning and exploration, feature engineering, model selection, and evaluation of results.
As part of the tutorial, you will create two different models based on a fictitious dataset. The dataset consists of records for the customers of a telecom service provider. Its columns hold information such as the length of the customer's account and the total day, evening, night, and international minutes used.
The first model you will create addresses churn analysis, also known as customer attrition: the problem of identifying the customers who are likely to leave a service or a business. The goal of the analysis is to contact these high-risk individuals and take action, such as providing special offers and discounts, to prevent them from leaving. You will model the problem as a binary classification task. You will then create a web service for the model and visualize the classification results.
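To make the binary classification framing concrete, here is a minimal sketch in plain Python. It is not the Azure Machine Learning Studio workflow used in the tutorial: it swaps in a hand-written logistic regression trained by gradient descent, and it runs on synthetic data. The feature names `day_minutes` and `intl_minutes`, the value ranges, and the rule that generates the churn labels are all assumptions made up for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_synthetic_customers(n, seed):
    # Hypothetical data: two usage features scaled to [-1, 1] and a churn
    # label drawn from an invented rule (heavier usage raises churn risk).
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        day_minutes = rng.uniform(0, 350)    # assumed range
        intl_minutes = rng.uniform(0, 20)    # assumed range
        x = ((day_minutes - 175) / 175, (intl_minutes - 10) / 10)
        churn_probability = sigmoid(4 * x[0] + 2 * x[1])
        rows.append((x, 1 if rng.random() < churn_probability else 0))
    return rows

def predict_proba(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(rows, epochs=300, lr=0.5):
    # Batch gradient descent on the average log loss.
    weights, bias = [0.0, 0.0], 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in rows:
            err = predict_proba(x, weights, bias) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

train_rows = make_synthetic_customers(600, seed=0)
test_rows = make_synthetic_customers(200, seed=1)
weights, bias = train_logistic(train_rows)

# Evaluate on held-out customers with a 0.5 decision threshold.
correct = sum(
    (predict_proba(x, weights, bias) >= 0.5) == (y == 1) for x, y in test_rows
)
accuracy = correct / len(test_rows)
print("held-out accuracy:", round(accuracy, 3))
```

In a churn setting, the predicted probability is often more useful than the hard label: sorting customers by `predict_proba` gives a ranked list of high-risk individuals to contact first.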
Optionally, you can create a second model: a segmentation model, where the objective is to find natural clusters of customers in the dataset who share similar characteristics. Segmentation helps you understand the customer base for targeted marketing, where the goal is to reach the right individuals in order to grow the business.
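As an illustration of what finding natural clusters can look like, the following sketch runs k-means clustering in plain Python. The session description does not say which clustering algorithm the tutorial uses, so k-means is an assumption here, as are the farthest-first initialization, the two synthetic customer groups, and the feature names `day_minutes` and `night_minutes`.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_centers(points, k):
    # Deterministic initialization: start from the first point, then
    # repeatedly add the point farthest from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, max_iter=100):
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical customers as (day_minutes, night_minutes): 50 light users
# followed by 50 heavy users, drawn from two well-separated groups.
rng = random.Random(42)
light_users = [(rng.gauss(100, 15), rng.gauss(150, 15)) for _ in range(50)]
heavy_users = [(rng.gauss(260, 15), rng.gauss(250, 15)) for _ in range(50)]
customers = light_users + heavy_users

labels, centers = kmeans(customers, k=2)
for j, center in enumerate(centers):
    print("segment", j, "size", labels.count(j), "center", center)
```

On real customer data the features would usually be standardized first so that no single column dominates the distance, and the number of segments would be chosen by inspecting several values of `k` rather than fixed in advance.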
Danielle Dean is a principal data scientist lead in AzureCAT within the Cloud AI Platform Division at Microsoft, where she leads an international team of data scientists and engineers to build predictive analytics and machine learning solutions with external companies utilizing Microsoft’s Cloud AI Platform. Previously, she was a data scientist at Nokia, where she produced business value and insights from big data through data mining and statistical modeling on data-driven projects that impacted a range of businesses, products, and initiatives. Danielle holds a PhD in quantitative psychology from the University of North Carolina at Chapel Hill, where she studied the application of multilevel event history models to understand the timing and processes leading to events between dyads within social networks.
Wee Hyong Tok is a principal data science manager with Microsoft. Wee Hyong has worn many hats in his career, including developer, program and product manager, data scientist, researcher, and strategist, and his track record of leading successful engineering and data science teams has given him unique superpowers to be a trusted AI advisor to customers. Wee Hyong coauthored several books on artificial intelligence, including Predictive Analytics Using Azure Machine Learning and Doing Data Science with SQL Server. Wee Hyong holds a PhD in computer science from the National University of Singapore.
©2015, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.