Sep 23–26, 2019

Schedule: Model Development, Governance, Operations sessions

Companies are realizing that machine learning model development is not quite the same as software development. Completing the ML model building process doesn’t automatically translate to a working system. The data community is still building tools to help manage the entire lifecycle, which also includes model deployment, monitoring, and operations. While tools and best practices are just beginning to emerge and be shared, model lifecycle management is one of the most active areas in the data space.

9:00am–12:30pm Tuesday, September 24, 2019
Location: 1A 12/14
Sourav Dey (Manifold), Jakov Kucan (Manifold)
In this tutorial, we will walk through the six steps of our Lean AI process and explain how they help your ML engineers work as an integrated part of your development and production teams. We will also walk through a hands-on example using real-world data from one of our client companies, so you can get up and running with Docker and Orbyter and see first-hand how streamlined they can make...
9:00am–12:30pm Tuesday, September 24, 2019
Location: 1A 21/22
Jules Damji (Databricks)
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work.
1:30pm–5:00pm Tuesday, September 24, 2019
Location: 1E 10
Boris Lublinsky (Lightbend), Dean Wampler (Lightbend)
This hands-on tutorial examines production use of ML in streaming data pipelines: how to do periodic model retraining and low-latency scoring in live streams. We'll discuss Kafka as the data backplane, pros and cons of microservices vs. systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, model metadata tracking, and other techniques.
11:20am–12:00pm Wednesday, September 25, 2019
Location: 1E 10/11
David Talby (Pacific AI)
Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries.
5:25pm–6:05pm Wednesday, September 25, 2019
Location: 1A 06/07
The common perception of deep learning is that it results in a fully self-contained model. However, in most cases these models have similar requirements for data pre-processing as more "traditional" machine learning. Despite this, there are few standard solutions for deploying end-to-end deep learning. In this talk, I show how the ONNX format and ecosystem are addressing this challenge.
1:15pm–1:55pm Thursday, September 26, 2019
Location: 1A 21/22
Jim Scott (NVIDIA)
Data scientists are creating and testing hundreds or thousands more models than in the past. Models require support from both real-time and static data sources. As data is enriched and parameters are tuned and explored, everything needs to be versioned, including the data. We will discuss these specific problems and approaches to fixing them.
2:05pm–2:45pm Thursday, September 26, 2019
Location: 1A 21/22
Diego Oppenheimer (Algorithmia)
Machine Learning (ML) will fundamentally change the way we build and maintain applications. How can we adapt our infrastructure, operations, staffing, and training to meet the challenges of the new Software Development Life Cycle (SDLC) without throwing away everything that already works?
2:05pm–2:45pm Thursday, September 26, 2019
Location: 1A 12/14
Andrew Leamon (Comcast), Wadkar Sameer (Comcast NBCUniversal)
An overview of the data management and privacy challenges around automating ML model (re)deployments and stream-based inferencing at scale.
3:45pm–4:25pm Thursday, September 26, 2019
Location: 1A 21/22
Sireesha Muppala (Amazon Web Services), Shelbee Eigenbrode (Amazon Web Services), Randall DeFauw (Amazon Web Services)
As more automation becomes available to data science, a balance between automation and quality needs to be maintained. Applying DevOps practices to machine learning workloads not only brings models to market faster but also maintains the quality and integrity of those models. This presentation will focus on applying DevOps practices to ML workloads.
4:35pm–5:15pm Thursday, September 26, 2019
Location: 1E 07/08
Evgeny Vinogradov (Yandex.Money)
With a microservice architecture, the data warehouse (DWH) is the first place where all the data comes together. It is supplied by many different data sources and is used for many purposes, from near-OLTP workloads to model fitting and real-time classification. The talk will cover our experience managing and scaling the data engineering team and infrastructure to support 20+ product teams.
