Sep 23–26, 2019

Schedule: Model Development, Governance, Operations sessions

Companies are realizing that machine learning model development is not quite the same as software development. Completion of the ML model building process doesn’t automatically translate to a working system. The data community is still in the process of building tools to help manage the entire lifecycle which also includes model deployment, monitoring, and operations. While tools and best practices are just beginning to emerge and be shared, model lifecycle management is one of the most active areas in the data space.

Add to your personal schedule
9:00am12:30pm Tuesday, September 24, 2019
Location: 1A 12/14
Sourav Dey (Manifold), Jakov Kucan (Manifold)
Sourav Dey and Jakov Kucan walk you through the six steps of the Lean AI process and explain how it helps your ML engineers work as an an integrated part of your development and production teams. You'll get a hands-on example using real-world data, so you can get up and running with Docker and Orbyter and see firsthand how streamlined they can make your workflow. Read more.
Add to your personal schedule
9:00am12:30pm Tuesday, September 24, 2019
Location: 1A 21
Jules Damji (Databricks)
ML development brings many new complexities beyond the software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information. Jules Damji walks you through MLflow, an open source project that simplifies the entire ML lifecycle, to solve this problem. Read more.
Add to your personal schedule
1:30pm5:00pm Tuesday, September 24, 2019
Location: 1E 15/16
Boris Lublinsky (Lightbend), Dean Wampler (Anyscale)
Boris Lublinsky and Dean Wampler examine ML use in streaming data pipelines, how to do periodic model retraining, and low-latency scoring in live streams. Learn about Kafka as the data backplane, the pros and cons of microservices versus systems like Spark and Flink, tips for TensorFlow and SparkML, performance considerations, metadata tracking, and more. Read more.
Add to your personal schedule
11:20am12:00pm Wednesday, September 25, 2019
Location: 1A 21/22
Evgeny Vinogradov (Yandex.Money)
With a microservice architecture, a data warehouse is the first place where all the data meets. It's supplied by many different data sources and used for many purposes—from near-online transactional processing (OLTP) to model fitting and real-time classifying. Evgeny Vinogradov details his experience in managing and scaling data for support of 20+ product teams. Read more.
Add to your personal schedule
11:20am12:00pm Wednesday, September 25, 2019
Location: 1E 10/11
David Talby (Pacific AI)
Machine learning and data science systems often fail in production in unexpected ways. David Talby outlines real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries. Read more.
Add to your personal schedule
5:25pm6:05pm Wednesday, September 25, 2019
Location: 1A 06/07
The common perception of deep learning is that it results in a fully self-contained model. However, in most cases, these models have similar requirements for data preprocessing as does more "traditional" machine learning. Despite this, there are few standard solutions for deploying end-to-end deep learning. Nick Pentreath explores how the ONNX format and ecosystem addresses this challenge. Read more.
Add to your personal schedule
5:25pm6:05pm Wednesday, September 25, 2019
Location: 1E 06
venkata gunnu (Comcast), Harish Doddi (Datatron)
Machine learning infrastructure is key to the success of AI at scale in enterprises, with many challenges when you want to bring machine learning models to a production environment, given the legacy of the enterprise environment. Venkata Gunnu and Harish Doddi explore some key insights, what worked, what didn't work, and best practices that helped the data engineering and data science teams. Read more.
Add to your personal schedule
1:15pm1:55pm Thursday, September 26, 2019
Location: 1A 21/22
Jim Scott (NVIDIA)
Data scientists create and test hundreds or thousands more models than in the past. Models require support from both real-time and static data sources. As data becomes enriched, and parameters tuned and explored, there's a need for versioning everything, including the data. Jim Scott examines the very specific problems and approaches to fix them. Read more.
Add to your personal schedule
2:05pm2:45pm Thursday, September 26, 2019
Location: 1A 21/22
Diego Oppenheimer (Algorithmia)
Machine learning (ML) will fundamentally change the way we build and maintain applications. Diego Oppenheimer dives into how you can adapt your infrastructure, operations, staffing, and training to meet the challenges of the new software development life cycle (SDLC) without throwing away everything that already works. Read more.
Add to your personal schedule
2:05pm2:45pm Thursday, September 26, 2019
Location: 1A 12/14
Mumin Ransom (Comcast), Nick Pinckernell (Comcast)
Mumin Ransom gives an overview of the data management and privacy challenges around automating ML model (re)deployments and stream-based inferencing at scale. Read more.
Add to your personal schedule
3:45pm4:25pm Thursday, September 26, 2019
Location: 1A 21/22
Sireesha Muppala (Amazon Web Services), Shelbee Eigenbrode (Amazon Web Services), Randall DeFauw (Amazon Web Services)
As an increasing level of automation becomes available to data science, the balance between automation and quality needs to be maintained. Applying DevOps practices to machine learning workloads brings models to the market faster and maintains the quality and integrity of those models. Sireesha Muppala, Shelbee Eigenbrode, and Randall DeFauw explore applying DevOps practices to ML workloads. Read more.
  • Cloudera
  • O'Reilly
  • Google Cloud
  • IBM
  • Cisco
  • Dataiku
  • Intel
  • Io-Tahoe
  • MemSQL
  • Microsoft Azure
  • Oracle Cloud Infrastructure
  • SAS
  • Arcadia Data
  • BMC Software
  • Hazelcast
  • SAP
  • Amazon Web Services
  • Anaconda
  • Esri
  • Infoworks.io, Inc.
  • Kyligence
  • Pitney Bowes
  • Talend
  • Google Cloud
  • Confluent
  • DataStax
  • Dremio
  • Immuta
  • Impetus Technologies Inc.
  • Keyence
  • Kyvos Insights
  • StreamSets
  • Striim
  • Syncsort
  • SK holdings C&C

    Contact us

    confreg@oreilly.com

    For conference registration information and customer service

    partners@oreilly.com

    For more information on community discounts and trade opportunities with O’Reilly conferences

    strataconf@oreilly.com

    For information on exhibiting or sponsoring a conference

    pr@oreilly.com

    For media/analyst press inquires