Continuous Delivery for Machine Learning: Automating the end-to-end lifecycle
Who is this presentation for?
Data scientists or analysts
Releasing Machine Learning systems into production is harder than releasing traditional software. These systems are non-deterministic, hard to test, hard to explain, and hard to improve. You are not finished when you find your first working model; you also need to think about integration, testing, deployment, scaling, and monitoring. What’s more, after launch, you will want to continuously adapt and improve your model to respond to the changing environment.
ThoughtWorks pioneered Continuous Delivery and has now developed it further to overcome the challenges associated with Machine Learning systems, calling the new approach Continuous Delivery for Machine Learning (CD4ML). CD4ML is a software engineering approach in which a cross-functional team produces machine learning applications based on code, data, and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles.
In this hands-on training, we will demonstrate how to apply CD4ML. Using a real Machine Learning application in a live scenario, you will learn how to:
- Create your deployment pipelines;
- Version your model training workflow to make it reproducible;
- Improve your model in a development environment, test its performance, and depending on the outcome, automatically deploy the new model into a production environment;
- Track model performance across various experiments; and
- Monitor and observe your model in production to close the data feedback loop.
The tech stack for this scenario will be Python with scikit-learn, DVC (Data Science Version Control), MLflow, GoCD, Docker, Git, Elasticsearch, Fluentd, Kibana, and Google Cloud Platform.
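As an illustration of the third step in the list above (test the model's performance and deploy automatically depending on the outcome), here is a minimal sketch of a promotion gate that a pipeline stage could run after training. It is not taken from the training materials: the names `EvaluationResult`, `should_deploy`, and `pipeline_exit_code`, the 0.80 threshold, and the "higher score is better" convention are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvaluationResult:
    """Validation score for one model version (hypothetical structure)."""
    model_version: str
    score: float  # assumption: higher is better, e.g. accuracy or R^2


def should_deploy(
    candidate: EvaluationResult,
    production: Optional[EvaluationResult],
    min_score: float = 0.80,       # assumed absolute quality bar
    min_improvement: float = 0.0,  # assumed required gain over production
) -> bool:
    """Return True when the candidate may be promoted to production."""
    # Reject any model that misses the absolute quality bar.
    if candidate.score < min_score:
        return False
    # With no model in production yet, the quality bar alone decides.
    if production is None:
        return True
    # Otherwise the candidate must at least match the current model.
    return candidate.score - production.score >= min_improvement


def pipeline_exit_code(
    candidate: EvaluationResult,
    production: Optional[EvaluationResult],
) -> int:
    """Map the decision to a process exit code: 0 passes the stage, 1 fails it."""
    return 0 if should_deploy(candidate, production) else 1
```

A pipeline stage could call `pipeline_exit_code` and exit with the result, so that a non-zero code stops the deployment stage from running. Where the scores come from (for example, an experiment tracking server) is left out of this sketch.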
Prerequisite knowledge
Basic knowledge of developing ML models (preferably in Python), source control with Git, and using Docker for local development.
Danilo Sato is a principal consultant at ThoughtWorks with more than 15 years of experience in many areas of architecture and engineering: software, data, infrastructure, and machine learning. Balancing strategy with execution, Danilo helps clients refine their technology strategy while adopting practices to reduce the time between having an idea, implementing it, and running it in production using the cloud, DevOps, and continuous delivery. He is the author of DevOps in Practice: Reliable and Automated Software Delivery, is a member of ThoughtWorks’ Technology Advisory Board and Office of the CTO, and is an experienced international conference speaker.