Presented By O'Reilly and Cloudera
Make Data Work
Dec 4–5, 2017: Training
Dec 5–7, 2017: Tutorials & Conference

R you ready for the cloud? Using R for operationalizing an enterprise-grade data science solution on Azure

Le Zhang (Microsoft), Graham Williams (Microsoft)
5:05pm–5:45pm Thursday, December 7, 2017
Average rating: 3.00 (1 rating)

Who is this presentation for?

  • Data scientists, data engineers, technical program managers, and cloud solution architects

Prerequisite knowledge

  • A working knowledge of R, big data computing platforms (e.g., Hadoop and Spark), machine learning and deep learning, and cloud computing

What you'll learn

  • Learn how to construct an end-to-end, Azure-based analytical pipeline and run scalable data science on it, entirely in R


R tops the list of the most popular data science languages, with a 49% share of the votes in a recent survey. However, the language is by nature limited in scalability and parallelism, which restricts its wide deployment in enterprise-grade applications. Contemporary big data solutions are migrating from on-premises to the cloud, owing to its benefits: flexibility in scaling resources up or out, computational efficiency, and cost effectiveness. To better leverage the advantages of cloud computing and smooth the move to the cloud, the community needs R packages and associated paradigms that allow R-using data scientists and data engineers to operationalize enterprise-grade pipelines for analytical solution development.

Le Zhang and Graham Williams demonstrate how to use R to architect enterprise-grade data analytics solutions and develop artificial intelligence applications on the Azure cloud. Le and Graham explore a real-world flight delay prediction scenario to illustrate how R is used to elastically deploy, manage, and deallocate a heterogeneous set of cloud resources, such as virtual machines, Spark clusters, and storage accounts, and to distribute on-demand parallel and scalable data analytics with cutting-edge machine learning technologies in the cloud. The R packages introduced greatly simplify the management and use of cloud resources for various big data tasks and therefore accelerate the pace of prototyping, experimenting with, and productizing data-driven solutions for enterprise use.

Le Zhang


Le Zhang is a data scientist with Microsoft Cloud and Artificial Intelligence, where he applies cutting-edge machine learning and artificial intelligence technology to accelerate digital transformation for enterprises and startups in the cloud. He’s helped numerous corporations develop and build enterprise-grade, scalable advanced data analytics systems across a broad spectrum of application scenarios, including manufacturing, predictive maintenance, financial services, ecommerce, and human resource analytics. Le specializes in cloud computing, big data technologies, and artificial intelligence. He enjoys sharing knowledge and learning from people and is a frequent speaker at industry and academic conferences and community meetups. He holds a PhD in computer engineering.

Graham Williams


Graham Williams is director of data science at Microsoft, where he is responsible for the Asia-Pacific region, an adjunct professor with the University of Canberra and the Australian National University, and an international visiting professor with the Chinese Academy of Sciences. Graham has 30 years’ experience as a data scientist leading research and deployments in artificial intelligence, machine learning, data mining, and analytics. Previously, he was principal data scientist with the Australian Taxation Office and lead data scientist with the Australian Government’s Centre of Excellence in Analytics, where he assisted numerous government departments and Australian industry in creating and building data science capabilities. He has also worked on many projects focused on delivering solutions and applications driven by data using machine learning and artificial intelligence technologies. Graham has authored a number of books introducing data mining and machine learning using the R statistical software.