Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Model lifecycle management sessions

Companies are realizing that machine learning model development is not quite the same as software development. Completing the ML model building process doesn’t automatically translate to a working system. The data community is still building tools to help manage the entire lifecycle, which also includes model deployment, monitoring, and operations. While tools and best practices are just beginning to emerge and be shared, model lifecycle management is one of the most active areas in the data space.

9:00am–12:30pm Tuesday, 09/11/2018
Location: 1E 06 Level: Intermediate
Dan Crankshaw (UC Berkeley RISELab)
Average rating: *****
(5.00, 1 rating)
Dan Crankshaw offers an overview of the current challenges in deploying machine learning applications into production and the current state of prediction serving infrastructure. He then leads a deep dive into the Clipper serving system and shows you how to get started.
1:30pm–5:00pm Tuesday, 09/11/2018
Location: 1E 09 Level: Intermediate
Brian Foo (Google), Holden Karau (Google), Jay Smith (Google)
Average rating: **...
(2.00, 7 ratings)
TensorFlow and Keras are popular libraries for training deep models due to hardware accelerator support. Brian Foo, Jay Smith, and Holden Karau explain how to bring deep learning models from training to serving in a cloud production environment. You'll learn how to unit-test, export, package, deploy, optimize, serve, monitor, and test models using Docker and TensorFlow Serving in Kubernetes.
1:15pm–1:55pm Wednesday, 09/12/2018
Location: 1A 08 Level: Beginner
William Benton (Red Hat)
Average rating: *****
(5.00, 2 ratings)
Containers are a hot technology for application developers, but they also provide key benefits for data scientists. William Benton details the advantages of containers for data scientists and AI developers, focusing on high-level tools that will enable you to become more productive and collaborate more effectively.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1E 14 Level: Intermediate
David Talby (Pacific AI)
Average rating: ****.
(4.40, 5 ratings)
Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: 1A 08 Level: Beginner
Atul Kale (Airbnb), Xiaohan Zeng (Airbnb)
Average rating: *****
(5.00, 3 ratings)
Atul Kale and Xiaohan Zeng offer an overview of Bighead, Airbnb's user-friendly and scalable end-to-end machine learning framework that powers Airbnb's data-driven products. Built on Python, Spark, and Kubernetes, Bighead integrates popular libraries like TensorFlow, XGBoost, and PyTorch and is designed to be used in modular pieces.
2:05pm–2:45pm Wednesday, 09/12/2018
Location: Expo Hall
Mani Parkhe (Databricks), Andrew Chen (Databricks)
Successfully building and deploying a machine learning model is difficult to do once. Enabling other data scientists to reproduce your pipeline, compare the results of different versions, track what's running where, and redeploy and roll back updated models is much harder. Mani Parkhe and Andrew Chen offer an overview of MLflow, a new open source project from Databricks that simplifies this process.
4:35pm–5:15pm Wednesday, 09/12/2018
Location: 1E 09 Level: Intermediate
Dave Shuman (Cloudera), Bryan Dean (Red Hat)
The focus on the IoT is turning increasingly to the edge, and the way to make the edge more intelligent is by building machine learning models in the cloud and pushing them back out to the edge. Dave Shuman and Bryan Dean explain how Cloudera and Red Hat executed this architecture at one of Europe's leading manufacturers and give a demo highlighting it.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1A 21/22 Level: Intermediate
Jay Kreps (Confluent)
Average rating: ****.
(4.00, 2 ratings)
Machine learning has become mainstream, and suddenly businesses everywhere are looking to build systems that use it to optimize aspects of their products, processes, or customer experience. Jay Kreps explores some of the difficulties of building production machine learning systems and explains how Apache Kafka and stream processing can help.
5:25pm–6:05pm Wednesday, 09/12/2018
Location: 1E 10/11 Level: Intermediate
Diego Oppenheimer (Algorithmia)
Average rating: ****.
(4.50, 2 ratings)
After big investments in collecting and cleaning data and building machine learning (ML) models, enterprises face big challenges in deploying models to production and managing a growing portfolio of ML models. Diego Oppenheimer covers the strategic and technical hurdles each company must overcome and the best practices developed while deploying over 4,000 ML models for 70,000 engineers.
1:10pm–1:50pm Thursday, 09/13/2018
Location: 1A 10 Level: Intermediate
Wangda Tan (Hortonworks)
Average rating: ****.
(4.50, 2 ratings)
In order to train deep learning and machine learning models, you must leverage applications such as TensorFlow, MXNet, Caffe, and XGBoost. Wangda Tan discusses new features in Apache Hadoop 3.x to better support deep learning workloads and demonstrates how to run these applications on YARN.
2:00pm–2:40pm Thursday, 09/13/2018
Location: Expo Hall Level: Intermediate
Chris Fregly (PipelineAI)
Average rating: ***..
(3.50, 2 ratings)
Chris Fregly details a full-featured, open source end-to-end TensorFlow model training and deployment system, using the latest advancements with Kubernetes, TensorFlow, and GPUs.
2:00pm–2:40pm Thursday, 09/13/2018
Location: 1A 10 Level: Intermediate
Michelle Casbon (Google)
Average rating: *****
(5.00, 2 ratings)
Michelle Casbon demonstrates how to build a machine learning application with Kubeflow. Kubeflow makes it easy for everyone to develop, deploy, and manage portable, scalable ML everywhere and supports the full lifecycle of an ML product, including iteration via Jupyter notebooks. Join Michelle to find out what Kubeflow currently supports and the long-term vision for the project.
4:20pm–5:00pm Thursday, 09/13/2018
Location: 1A 08 Level: Intermediate
Harish Doddi (Datatron Technologies), Jerry Xu (Datatron Technologies)
Large financial institutions have many data science teams (e.g., those for fraud, credit risk, and marketing), each often using a diverse set of tools to build predictive models. There are many challenges involved in productionizing these predictive AI models. Harish Doddi and Jerry Xu share challenges and lessons learned deploying AI models to production in large financial institutions.