In this tutorial, you will build end-to-end predictive models using the extensive library of machine learning algorithms in Microsoft Azure Machine Learning Studio, along with its R and Python language extensibility. You will then deploy the model, consume it, and use it to make predictions over business data. Along the way, you will walk through the typical steps of a machine learning project: data ingestion, data cleaning and exploration, feature engineering, model selection, and evaluation of results.
As part of the tutorial, you will create two different models based on a fictitious dataset. The dataset consists of customer records from a telecom service provider. Its columns hold information such as the length of the customer's account and the total day, evening, night, and international minutes used.
The first model addresses churn analysis, also known as customer attrition: the problem of identifying customers who are likely to leave a service or a business. The goal of the analysis is to contact these high-risk individuals and take action, such as providing special offers and discounts, to prevent them from leaving. You will model the problem as a binary classification task. You will then create a web service for the model and visualize the classification results.
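To make the binary classification framing concrete, here is a minimal local sketch in plain Python. It is not the Azure Machine Learning Studio workflow used in the tutorial. It swaps in a hand-written logistic regression trained by gradient descent, and the data is synthetic. The column choice (day and international minutes), the assumed usage pattern for churners, and every function name are assumptions made for illustration only.

```python
import math
import random


def make_customers(n, seed=0):
    """Generate synthetic (features, label) pairs; label 1 means churned."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        churned = 1 if rng.random() < 0.5 else 0
        # Assumed pattern, for illustration only: churners use more minutes.
        day_minutes = rng.gauss(240 if churned else 160, 30)
        intl_minutes = rng.gauss(12 if churned else 9, 3)
        X.append([day_minutes, intl_minutes])
        y.append(churned)
    return X, y


def fit_scaler(X):
    """Return per-column means and standard deviations."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return means, stds


def scale(row, means, stds):
    """Standardize one row so both features share a comparable scale."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(row, w, b):
    """Predicted probability that a (scaled) customer row churns."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by full-batch gradient descent on log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = churn_probability(row, w, b) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def run_demo(seed=0):
    """Train on 400 synthetic customers and score 200 held-out ones."""
    X, y = make_customers(600, seed)
    X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]
    means, stds = fit_scaler(X_train)
    w, b = train_logistic([scale(r, means, stds) for r in X_train], y_train)
    preds = [churn_probability(scale(r, means, stds), w, b) >= 0.5
             for r in X_test]
    accuracy = sum(int(p) == t for p, t in zip(preds, y_test)) / len(y_test)
    return w, b, means, stds, accuracy


if __name__ == "__main__":
    *_, accuracy = run_demo()
    print(f"held-out accuracy: {accuracy:.2f}")
```

The output of `churn_probability` is the score a retention team would rank customers by. The 0.5 cutoff in `run_demo` is only a default, and in practice it would be tuned to the cost of an offer versus the cost of a lost customer.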
The second, optional model is a segmentation model, where the objective is to find natural clusters of customers with similar characteristics within the dataset. Segmentation helps you understand the customer base for targeted marketing applications, where the goal is to reach the right individuals in order to grow the business.
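As a rough illustration of what finding natural clusters means, the following sketch runs k-means on synthetic customers in plain Python. It is an assumption-laden stand-in, not the clustering module from Azure Machine Learning Studio. The three segments, their mean usage values, the choice of k-means with farthest-first initialization, and all function names are hypothetical.

```python
import math
import random


def make_customers(n_per_segment=50, seed=0):
    """Synthetic [day_minutes, intl_minutes] rows from three made-up segments."""
    rng = random.Random(seed)
    # (mean day minutes, mean international minutes) per hypothetical segment.
    segment_means = [(100, 5), (250, 5), (100, 20)]
    rows = []
    for day_mean, intl_mean in segment_means:
        for _ in range(n_per_segment):
            rows.append([rng.gauss(day_mean, 10), rng.gauss(intl_mean, 1)])
    return rows


def standardize(rows):
    """Z-score each column so day minutes do not dominate the distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def init_centroids(points, k):
    """Farthest-first initialization: deterministic, spreads centroids out."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    rows = make_customers()
    labels, _ = kmeans(standardize(rows), k=3)
    for i in range(3):
        print(f"segment {i}: {labels.count(i)} customers")
```

Unlike the churn model, no label is supplied here: the segments emerge from usage patterns alone, and it is up to the analyst to interpret each cluster (for example, as heavy daytime callers or frequent international callers).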
Danielle Dean is a principal data scientist lead at Microsoft in the Algorithms and Data Science Group within the Artificial Intelligence and Research Division, where she leads a team of data scientists and engineers building predictive analytics and machine learning solutions with external companies utilizing Microsoft’s Cloud AI Platform. Previously, she was a data scientist at Nokia, where she produced business value and insights from big data through data mining and statistical modeling on data-driven projects that impacted a range of businesses, products, and initiatives. Danielle holds a PhD in quantitative psychology from the University of North Carolina at Chapel Hill, where she studied the application of multilevel event history models to understand the timing and processes leading to events between dyads within social networks.
Wee Hyong Tok is a principal data science manager at Microsoft, where he works with teams to cocreate new value and turn each of the challenges facing organizations into compelling data stories that can be concretely realized using proven enterprise architecture. Wee Hyong has worn many hats in his career, including developer, program and product manager, data scientist, researcher, and strategist, and his range of experience has given him unique superpowers to nurture and grow high-performing innovation teams that enable organizations to embark on their data-driven digital transformations using artificial intelligence. He strongly believes in story-driven innovation and has a passion for leading artificial intelligence-driven innovations and working with teams to envision how these innovations can create new competitive advantage and value for their business. He coauthored one of the first books on Azure Machine Learning, Predictive Analytics Using Azure Machine Learning, and authored another demonstrating how database professionals can do AI with databases, Doing Data Science with SQL Server.
©2015, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.