Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

MLflow: An open platform to simplify the machine learning lifecycle

Mani Parkhe (Databricks), Andrew Chen (Databricks)
2:05pm–2:45pm Wednesday, 09/12/2018
Secondary topics: Model lifecycle management

What you'll learn

  • How MLflow, a new open source project from Databricks, simplifies the process of managing the ML lifecycle

Description

Successfully building and deploying a machine learning model is difficult to do once. Enabling other data scientists (or even yourself, one month later) to reproduce your pipeline, compare the results of different versions, track what’s running where, and redeploy and roll back updated models is much harder.

Mani Parkhe and Andrew Chen offer an overview of MLflow, a new open source project from Databricks that simplifies this process. MLflow provides APIs for tracking experiment runs across multiple users within a reproducible environment and for managing the deployment of models to production. MLflow is also designed to be an open, modular platform: you can use it with any existing ML library and incorporate it incrementally into an existing ML development process.

Mani Parkhe

Databricks

Mani Parkhe is an ML and AI platform engineer at Databricks, where he works on various customer-facing and open source platform initiatives to enable data discovery, training, experimentation, and deployment of ML models in the cloud. Mani is a lifelong student and coding geek with a passion for elegance in design. Previously, he spent 15 years building software for semiconductor chip CAD before transitioning to big data infrastructure, distributed systems, web services, and machine learning. He also worked on various data-intensive batch and stream processing problems at LinkedIn and Uber. Mani holds a master’s degree in CS from the University of Florida. He lives in Almaden Valley with his wife and three amazing kids.

Andrew Chen

Databricks

Andrew Chen is a software engineer at Databricks and an MLflow committer. At Databricks, Andrew works on tools to simplify the end-to-end experience of machine learning, all the way from data ETL to model training and deployment. Before joining Databricks, Andrew received his BS in EECS from UC Berkeley in 2016. While in school, he also briefly worked on search quality at Pinterest and search engine marketing at Groupon.