Presented By O’Reilly and Cloudera

Make Data Work

21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Machine learning platform lifecycle management

Hope Wang (Intuit)

14:55–15:35 Thursday, 24 May 2018

Data engineering and architecture, Data-driven business management
Location: Capital Suite 7
Level: Intermediate

Secondary topics: Financial Services, Managing and Deploying Machine Learning

Who is this presentation for?

Software architects, engineers, and data scientists

Prerequisite knowledge

Basic knowledge of machine learning development

What you'll learn

Learn how to build and virtualize end-to-end lifecycle management
Explore the components of a machine learning platform and learn how different components associate and interact
Understand how to execute and manage models in a production environment
Explore a case study of taking a model through the deployment process

Description

Data science and machine learning are critical enabling factors for data-driven organizations, and engineering organizations face sharply rising expectations to develop and scale machine learning capabilities. A machine learning platform is not just the sum of its parts; the key is how it supports the model lifecycle end to end. This includes data discovery, feature engineering, iterative model development, model training, and model scoring (batch and online). The management of artifacts, their associations, and deployment across various platform components is vital.

While a number of mature technologies support each phase of this lifecycle, few solutions tie these components together into a cohesive machine learning platform. To support the lifecycle of a model, you must be able to manage the various ML-related artifacts and their associations and automate deployment. A lifecycle management service built for this purpose should handle storage, versioning, visualization (including associations), and deployment of artifacts. The platform should support model development in different programming languages, with language and package versions configured specifically for each model. Having the custom environment follow the model through the lifecycle is important to guarantee that the model always runs in the same environment. Thus, the environment should be externalized, associated, and deployed together with the model. Other considerations include the connection between various artifacts and platforms: the data and datasets (source data and feature data, training datasets, and scoring result sets), the code (notebook code, model code, deployment code, etc.), model-specific environments, and platforms (developing and training platforms, batch and online scoring platforms).
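The idea of versioned artifacts, explicit associations, and an environment that is deployed together with its model can be illustrated with a minimal sketch. This is not Intuit's implementation; the class names (`Artifact`, `LifecycleRegistry`), the method names, and the example model, dataset, and package values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical sketch: every name and value below is an assumption made for
# illustration, not part of any real lifecycle management service.


@dataclass(frozen=True)
class Artifact:
    kind: str      # e.g. "model", "environment", "dataset", "code"
    name: str
    version: int   # assigned by the registry, starting at 1
    payload: dict  # artifact-specific metadata


class LifecycleRegistry:
    """In-memory store for versioned artifacts and their associations."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, str], List[Artifact]] = {}
        self._links: Dict[Tuple[str, str, int], List[Artifact]] = {}

    @staticmethod
    def _key(artifact: Artifact) -> Tuple[str, str, int]:
        return (artifact.kind, artifact.name, artifact.version)

    def register(self, kind: str, name: str, payload: dict) -> Artifact:
        """Store a new version of an artifact and return it."""
        history = self._versions.setdefault((kind, name), [])
        artifact = Artifact(kind, name, len(history) + 1, payload)
        history.append(artifact)
        return artifact

    def associate(self, source: Artifact, target: Artifact) -> None:
        """Record that a specific version of source depends on target."""
        self._links.setdefault(self._key(source), []).append(target)

    def associations(self, source: Artifact) -> List[Artifact]:
        return list(self._links.get(self._key(source), []))

    def deployment_bundle(self, model: Artifact) -> dict:
        """Collect a model with its environment so they deploy together."""
        if model.kind != "model":
            raise ValueError("deployment bundles are built from models")
        linked = self.associations(model)
        envs = [a for a in linked if a.kind == "environment"]
        if len(envs) != 1:
            raise ValueError("a model needs exactly one environment")
        related = [a for a in linked if a.kind != "environment"]
        return {"model": model, "environment": envs[0], "related": related}


registry = LifecycleRegistry()
env = registry.register(
    "environment",
    "churn-env",
    {"language": "python", "packages": {"scikit-learn": "0.19.1"}},
)
training_data = registry.register("dataset", "churn-training", {"rows": 1000})
model_v1 = registry.register("model", "churn", {"algorithm": "logistic"})
model_v2 = registry.register("model", "churn", {"algorithm": "boosted-trees"})

# Only version 2 is linked here; each model version carries its own links.
registry.associate(model_v2, env)
registry.associate(model_v2, training_data)

bundle = registry.deployment_bundle(model_v2)
print(bundle["model"].version, bundle["environment"].name)
```

Keying associations by artifact version, rather than by name alone, is one way to keep a given model version pinned to the exact environment and training dataset it was built with, so a batch or online scoring platform can recreate the same runtime.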

Hope Wang explains how her team at Intuit is managing the machine learning lifecycle, how different components associate and interact with each other, and how to execute in a production environment. Hope then shares an example of how an integrated process was developed for data engineers and data scientists to manage the entire lifecycle of a model from ideation through development, training, and ultimately, scoring.

Hope Wang

Intuit

Hope Wang is a software engineer in Intuit’s Small Business Data and Analytics Group. Hope is a self-taught, self-motivated, fully powered hacker who is passionate about innovation. She holds a master’s degree in biomedical engineering from the University of Southern California.
