Presented By O’Reilly and Intel AI
Put AI to work
Sep 4-5, 2018: Training
Sep 5-7, 2018: Tutorials & Conference
San Francisco, CA

AI on Kubernetes

Daniel Whitenack (Pachyderm)
9:00am-12:30pm Wednesday, September 5, 2018
Implementing AI, Interacting with AI
Location: Continental 4 Level: Intermediate
Secondary topics:  Platforms and infrastructure
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data scientists, data engineers, AI researchers, machine learning engineers, and DevOps engineers

Prerequisite knowledge

  • Experience implementing ML in real-world environments on real-world data
  • Familiarity with the pain points around deployment and scaling
  • A basic understanding of the command line and typical ML workflows for training and inference (If you are new to the command line or need a refresher, work through a quick command-line tutorial beforehand.)

Materials or downloads needed in advance

  • A laptop with the ability to SSH into a remote machine (On a macOS or Linux machine, you should be able to SSH from a terminal; on a Windows machine, you can either install and use an SSH client like PuTTY or use the WSL.)

What you'll learn

  • Understand what Kubernetes is and what advantages it offers
  • Learn how to deploy machine learning training and inference on a Kubernetes cluster in the cloud and utilize tools such as Pachyderm and KubeFlow for data and pipeline management


It’s no secret that machine learning workflows are awkward to deploy, hard to maintain, and a frequent source of friction with engineering and IT teams. Work done by data scientists and machine learning researchers is often wasted because it never escapes their laptops or cannot be scaled to larger data.

Kubernetes—the container orchestration engine used by top technology companies, including Google, Amazon, and Microsoft—was built from the ground up to run and manage highly distributed workloads on huge clusters. Thus, it provides a solid foundation for model development.

Daniel Whitenack demonstrates how to easily deploy and scale AI/ML workflows on any infrastructure using Kubernetes. You’ll learn how to containerize and deploy model training and inference on Kubernetes using popular open source tools like Pachyderm and KubeFlow, and you’ll discover how to ingress/egress data, version models, utilize GPUs, and track and evaluate models.
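To give a flavor of what containerized training on Kubernetes looks like, here is a minimal sketch of a Kubernetes Job manifest that runs a TensorFlow training script in a container on a GPU node. The image name, script path, and resource numbers are illustrative assumptions, not materials from the tutorial:

```yaml
# Hypothetical Job that runs containerized model training once to completion.
# Replace the image and command with your own training container.
apiVersion: batch/v1
kind: Job
metadata:
  name: tf-train
spec:
  template:
    spec:
      containers:
        - name: trainer
          image: example.registry.io/tf-train:latest   # assumed image name
          command: ["python", "/code/train.py"]        # assumed script path
          resources:
            limits:
              nvidia.com/gpu: 1   # requires the NVIDIA device plugin on the cluster
      restartPolicy: Never
  backoffLimit: 2
```

A manifest like this would be submitted with `kubectl apply -f job.yaml`, and Kubernetes handles scheduling the work onto an appropriate node.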


  • Why Kubernetes? An introduction to Kubernetes and its use for ML development pipelines and deployments
  • An example workflow that uses TensorFlow for model training and inference
  • Managing data: Strategies for data management paired with Kubernetes; deploying an object storage-based strategy with example training/inference data
  • Deploying training/inference: Deploying TensorFlow training and inference on top of Kubernetes
  • Wrap-up and Q&A: Additional resources and how to manage advanced workflows (e.g., those that utilize GPUs)
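As a concrete sketch of the Pachyderm side of a workflow like the one outlined above, a pipeline is declared as a JSON spec that points a container at a versioned data repository. The pipeline name, image, script, and repo below are hypothetical placeholders:

```json
{
  "pipeline": { "name": "infer" },
  "transform": {
    "image": "example.registry.io/tf-infer:latest",
    "cmd": ["python", "/code/infer.py", "/pfs/images", "/pfs/out"]
  },
  "input": {
    "pfs": { "repo": "images", "glob": "/*" }
  }
}
```

A spec in this shape is created with Pachyderm's `pachctl create pipeline -f infer.json`; Pachyderm then mounts the input repo's data under `/pfs` inside the container and re-runs the pipeline as new data is committed.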

Daniel Whitenack


Daniel Whitenack is a PhD-trained data scientist and engineer at Pachyderm. His industry experience includes developing data science applications, such as predictive models, dashboards, recommendation engines, and more, for large and small companies. Daniel has spoken at conferences around the world, including Applied ML Days, Spark Summit, PyCon, ODSC, and GopherCon. He maintains the Go kernel for Jupyter and is actively helping to organize contributions to various open source data science projects.