Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Model lifecycle management sessions

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April
Data Science, Machine Learning & AI
Location: Capital Suite 17
Amir Issaei (Databricks)
Average rating: 5.00 (1 rating)
Join Amir Issaei to explore neural network fundamentals and learn how to build distributed Keras/TensorFlow models on top of Spark DataFrames. You'll use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models and MLflow to track experiments and manage the machine learning lifecycle. This course is taught entirely in Python.
9:00 - 12:30 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Danilo Sato (ThoughtWorks), Christoph Windheuser (ThoughtWorks)
Average rating: 4.31 (13 ratings)
Danilo Sato and Christoph Windheuser walk you through applying continuous delivery (CD), pioneered by ThoughtWorks, to data science and machine learning. Join in to learn how to make changes to your models while safely integrating and deploying them into production, using testing and automation techniques to release reliably at any time and with a high frequency.
9:00 - 12:30 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15
Holden Karau (Google), Trevor Grant (IBM), Francesca Lazzeri (Microsoft)
Average rating: 4.43 (7 ratings)
Holden Karau, Francesca Lazzeri, and Trevor Grant offer an overview of Kubeflow and walk you through using it to train and serve models across different cloud environments (and on-premises). You'll use a script to do the initial setup work, so you can jump (almost) straight into training a model on one cloud and then look at how to set up serving in another cluster/cloud.
13:30 - 17:00 Tuesday, 30 April 2019
Streaming and IoT
Location: Capital Suite 10
Boris Lublinsky (Lightbend), Dean Wampler (Lightbend)
Average rating: 4.20 (5 ratings)
Boris Lublinsky and Dean Wampler walk you through using ML in streaming data pipelines and doing periodic model retraining and low-latency scoring in live streams. You'll explore using Kafka as a data backplane, the pros and cons of microservices versus systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, model metadata tracking, and other techniques.
11:15 - 11:55 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 2/3
Harish Doddi (Datatron Technologies), Jerry Xu (Datatron Technologies)
Average rating: 5.00 (1 rating)
Harish Doddi and Jerry Xu share the challenges they faced scaling machine learning models and detail the solutions they're building to conquer them.
14:55 - 15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Mark Grover (Lyft), Deepak Tiwari (Lyft)
Average rating: 4.69 (13 ratings)
Lyft’s data platform is at the heart of the company's business. Decisions from pricing to ETA to business operations rely on it, and it powers the enormous scale and speed at which Lyft operates. Mark Grover and Deepak Tiwari walk you through the choices Lyft made in developing and sustaining the data platform, along with what lies ahead.
14:55 - 15:35 Wednesday, 1 May 2019
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio)
Average rating: 3.00 (1 rating)
Arun Kejariwal and Karthik Ramasamy walk you through an architecture in which models are served in real time and updated, using Apache Pulsar, without restarting the application at hand. They then describe how to apply Pulsar functions to support two example use cases, sampling and filtering, and explore a concrete case study of the same.
16:35 - 17:15 Wednesday, 1 May 2019
Arif Wider (ThoughtWorks), Emily Gorcenski (ThoughtWorks)
Average rating: 3.90 (10 ratings)
Machine learning can be challenging to deploy and maintain. Any delays in moving models from research to production mean leaving your data scientists' best work on the table. Arif Wider and Emily Gorcenski explore continuous delivery (CD) for AI/ML along with case studies for applying CD principles to data science workflows.
12:05 - 12:45 Thursday, 2 May 2019
Data Engineering and Architecture, Expo Hall
Location: Expo Hall 2 (Capital Hall N24)
Kai Wähner (Confluent)
Average rating: 4.75 (8 ratings)
How do you leverage the flexibility and extreme scale of the public cloud and the Apache Kafka ecosystem to build scalable, mission-critical machine learning infrastructures that span multiple public clouds or bridge your on-premises data center to the cloud? Join Kai Wähner to learn how to use technologies such as TensorFlow with Kafka’s open source ecosystem for machine learning infrastructures.