Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Model lifecycle management sessions

9:00 - 17:00 Monday, 29 April & Tuesday, 30 April
Data Science, Machine Learning & AI
Location: London Suite 3
Amir Issaei (Databricks)
The course covers the fundamentals of neural networks and how to build distributed Keras/TensorFlow models on top of Spark DataFrames. Throughout the class, you will use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models. You will also use MLflow to track experiments and manage the machine learning lifecycle. NOTE: This course is taught entirely in Python.
9:00 - 12:30 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Danilo Sato (ThoughtWorks), Christoph Windheuser (ThoughtWorks Inc.)
This workshop shows how to apply Continuous Delivery (CD), which ThoughtWorks pioneered, to data science and machine learning. CD lets data scientists change their models while safely integrating and deploying them into production, using testing and automation to release reliably at any time and at high frequency.
9:00 - 12:30 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 2/3
Holden Karau (Google), Trevor Grant (IBM), Ilan Filonenko (Bloomberg LP), Francesca Lazzeri (Microsoft)
This workshop briefly introduces Kubeflow and shows how to use it to train and serve models across different cloud environments (and on-prem). A script for the initial setup will be ready so you can jump (almost) straight into training a model on one cloud, then look at how to set up serving in another cluster or cloud. We will start with a simple model, with follow-up links provided.
13:30 - 17:00 Tuesday, 30 April 2019
Streaming and IoT
Location: Capital Suite 2/3
Boris Lublinsky (Lightbend), Dean Wampler (Lightbend)
This hands-on tutorial examines production use of ML in streaming data pipelines: how to do periodic model retraining and low-latency scoring in live streams. We'll discuss Kafka as the data backplane, the pros and cons of microservices vs. systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, model metadata tracking, and other techniques.
14:55 - 15:35 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Mark Grover (Lyft), Deepak Tiwari (Lyft)
Lyft’s data platform is at the heart of Lyft’s business. Decisions ranging from pricing to ETA to business operations rely on it. Moreover, it powers the enormous scale and speed at which Lyft operates. In this talk, Mark Grover walks through the choices Lyft has made in developing and sustaining the data platform, the reasons behind them, and what lies ahead.
14:55 - 15:35 Wednesday, 1 May 2019
Arun Kejariwal (Independent), Karthik Ramasamy (Streamlio)
This talk walks through an architecture in which models are served in real time and updated, using Apache Pulsar, without restarting the application. It also describes how Pulsar functions can support two example use cases, sampling and filtering, and works through a concrete case study.
16:35 - 17:15 Wednesday, 1 May 2019
Arif Wider (ThoughtWorks), Emily Gorcenski (ThoughtWorks)
Machine learning can be challenging to deploy and maintain. Data change, and both models and the systems that implement them must be able to adapt. Any delay in moving models from research to production means leaving your data scientists' best work on the table. In this talk, we explore continuous delivery (CD) for AI/ML and examine case studies that apply CD principles to data science workflows.
12:05 - 12:45 Thursday, 2 May 2019
Data Engineering and Architecture
Location: Capital Suite 7
Kai Wähner (Confluent)
How can you leverage the flexibility and extreme scale of the public cloud, combined with the Apache Kafka ecosystem, to build scalable, mission-critical machine learning infrastructures that span multiple public clouds or bridge your on-premise data centre to the cloud? Join this talk to learn how to apply technologies such as TensorFlow with Kafka’s open source ecosystem for machine learning infrastructures.