In this tutorial, you will create end-to-end predictive models using the extensive library of machine learning algorithms included in Microsoft Azure Machine Learning Studio, along with its R and Python language extensibility. You will then deploy and consume a model and use it to make predictions over business data. Along the way, you will walk through the typical steps of a machine learning project: data ingestion, data cleaning and exploration, feature engineering, model selection, and evaluation of results.
As part of the tutorial, you will create two different models based on a fictitious dataset. The dataset consists of customer records from a telecom service provider. Its columns hold information such as the length of the customer's account and the total day, evening, night, and international minutes used.
The first model addresses churn analysis, also known as customer attrition: the problem of identifying customers who are likely to leave a service or business. The goal of the analysis is to contact these high-risk individuals and take action, such as providing special offers and discounts, to prevent them from leaving. You will model the problem as a binary classification task. You will then create a web service for the model and visualize the classification results.
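To make the binary classification framing concrete, here is a minimal sketch in plain Python rather than Azure Machine Learning Studio. It fits a small logistic regression by gradient descent. The records, the two feature columns (account length and total day minutes), and every function name are hypothetical stand-ins chosen for illustration; they are not the tutorial's dataset or the algorithm the session uses.

```python
import math

# Hypothetical toy records, invented for illustration only:
# (account_length_months, total_day_minutes, churned)
RECORDS = [
    (3, 310.0, 1), (5, 295.0, 1), (8, 280.0, 1), (10, 300.0, 1),
    (40, 120.0, 0), (55, 150.0, 0), (60, 100.0, 0), (72, 130.0, 0),
]

raw_features = [row[:2] for row in RECORDS]
labels = [row[2] for row in RECORDS]

# Per-column mean and standard deviation, used to put features on one scale.
columns = list(zip(*raw_features))
means = [sum(col) / len(col) for col in columns]
stds = [
    math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
    for col, m in zip(columns, means)
]


def scale(row):
    """Standardize one raw feature row with the training statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, targets, learning_rate=0.5, epochs=500):
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, targets):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = train_logistic([scale(r) for r in raw_features], labels)


def churn_probability(raw_row):
    """Estimated probability that a customer with these raw features churns."""
    x = scale(raw_row)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Classify as "likely to churn" when the probability exceeds 0.5.
predictions = [1 if churn_probability(r) > 0.5 else 0 for r in raw_features]
```

Calling `churn_probability((4, 305.0))` scores a new, short-tenure, heavy-usage customer. In a retention campaign, the customers with the highest scores would be the ones contacted first.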
The second, optional model is a segmentation model, where the objective is to find natural clusters of customers in the dataset who share similar characteristics. Segmentation is also valuable for understanding the customer base in targeted marketing applications, where the goal is to reach the right individuals in order to grow the business.
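As a rough illustration of finding natural clusters of customers, the sketch below runs k-means in plain Python. K-means is assumed here as one common clustering technique; the session description does not say which algorithm is used. The customer points (total day minutes, international minutes) and all function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic start: first point, then repeatedly the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iters=100):
    """Return (cluster index per point, list of centroids)."""
    centroids = farthest_first_init(points, k)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


# Hypothetical customers, invented for illustration only:
# (total_day_minutes, international_minutes)
CUSTOMERS = [
    (100.0, 5.0), (110.0, 7.0), (95.0, 4.0),
    (300.0, 30.0), (310.0, 28.0), (290.0, 33.0),
]

segments, centers = k_means(CUSTOMERS, k=2)
```

On this toy input the light-usage and heavy-usage customers fall into separate segments. With real data, the features would normally be put on a common scale first so that no single column dominates the distance.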
Danielle Dean is a senior data scientist lead at Microsoft in the Algorithms and Data Science group within Cloud and Enterprise, where she leads a team of data scientists and engineers on end-to-end analytics projects using Microsoft’s Cortana Intelligence Suite—from automating the ingestion of data to analysis and implementation of algorithms, creating web services of these implementations, and using those to integrate into customer solutions or build end-user dashboards and visualizations. Danielle holds a PhD in quantitative psychology from the University of North Carolina at Chapel Hill, where she studied the application of multilevel event history models to understand the timing and processes leading to events between dyads within social networks.
Wee Hyong Tok is a principal data science manager at Microsoft, where he works with teams to cocreate new value and turn each of the challenges facing organizations into compelling data stories that can be concretely realized using proven enterprise architecture. Wee Hyong has worn many hats in his career, including developer, program and product manager, data scientist, researcher, and strategist, and his range of experience has given him unique superpowers to nurture and grow high-performing innovation teams that enable organizations to embark on their data-driven digital transformations using artificial intelligence. He has a passion for leading artificial intelligence-driven innovations and working with teams to envision how these innovations can create new competitive advantage and value for their business. He strongly believes in story-driven innovation.