October 28–31, 2019

Advanced model deployments with TensorFlow Serving

Hannes Hapke (SAP ConcurLabs)
11:50am–12:30pm Wednesday, October 30, 2019
Location: Grand Ballroom C/D

Who is this presentation for?

  • Machine learning engineers, DevOps engineers, and data scientists interested in deploying machine learning models




TensorFlow Serving is one of the cornerstones of the TensorFlow ecosystem. It has greatly eased the deployment of machine learning models and accelerated model deployments. Unfortunately, many machine learning engineers aren’t familiar with the details of TensorFlow Serving, and they’re missing out on significant performance gains.

Hannes Hapke provides a brief introduction to TensorFlow Serving, then leads a deep dive into advanced settings and use cases. He introduces advanced concepts and implementation suggestions for improving the performance of a TensorFlow Serving setup. The session covers how clients can request model metadata from the model server, model optimization options for higher prediction throughput, request batching to improve throughput, an example implementation that supports model A/B testing, and monitoring of a TensorFlow Serving setup.
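As a rough illustration of two of the topics above, metadata requests and A/B testing, the following sketch builds the REST URLs a client could call. It is not taken from the talk. The `/v1/models/<name>[/versions/<n>]/metadata` and `:predict` paths follow TensorFlow Serving's REST API; the host, the port `8501`, the model name `my_model`, and the client-side random split are assumptions chosen for illustration, and the session may well present a different A/B approach.

```python
import json
import random
import urllib.request

# Assumption: a model server reachable locally on the usual REST port.
BASE_URL = "http://localhost:8501"


def metadata_url(model_name, version=None):
    """Build the REST URL of the metadata endpoint for a model (optionally a fixed version)."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        path += f"/versions/{version}"
    return f"{BASE_URL}{path}/metadata"


def predict_url(model_name, version):
    """Build the REST URL of the predict endpoint for one specific model version."""
    return f"{BASE_URL}/v1/models/{model_name}/versions/{version}:predict"


def choose_version(version_a, version_b, share_b, rng=random):
    """Hypothetical client-side A/B split: send roughly share_b of requests to version_b."""
    return version_b if rng.random() < share_b else version_a


def fetch_metadata(model_name, version=None):
    """Fetch and decode the metadata JSON. Requires a running model server."""
    with urllib.request.urlopen(metadata_url(model_name, version)) as response:
        return json.load(response)
```

For example, `predict_url("my_model", choose_version(1, 2, 0.1))` would route about 10% of requests to version 2, provided both versions are loaded by the server.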

Prerequisite knowledge

  • A basic understanding of Docker functionality and how HTTP requests work
  • General knowledge of machine learning (useful but not required)

What you'll learn

  • Learn how to increase TensorFlow Serving inference performance, reduce inference response time by tweaking the request payload, and run TensorFlow Serving with NVIDIA's TensorRT for further performance improvements
  • Discover how to configure request batching in TensorFlow Serving and how to configure TensorFlow Serving to provide A/B testing capabilities
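To make the batching item more concrete, here is a minimal sketch that renders a batching parameters file in the text format the model server reads when started with `--enable_batching` and `--batching_parameters_file`. The helper name `batching_parameters` and the numeric values are hypothetical and are not recommendations from the session; suitable values depend on the model and hardware.

```python
def batching_parameters(max_batch_size, batch_timeout_micros,
                        num_batch_threads, max_enqueued_batches):
    """Render a TensorFlow Serving batching parameters file as text."""
    settings = {
        "max_batch_size": max_batch_size,              # largest batch the server will assemble
        "batch_timeout_micros": batch_timeout_micros,  # how long to wait for a batch to fill
        "num_batch_threads": num_batch_threads,        # batches processed in parallel
        "max_enqueued_batches": max_enqueued_batches,  # queue limit before requests are rejected
    }
    lines = [f"{name} {{ value: {value} }}" for name, value in settings.items()]
    return "\n".join(lines) + "\n"


# Illustrative values only.
config_text = batching_parameters(32, 5000, 4, 100)
print(config_text)
```

The printed text can be saved to a file and passed to the server through `--batching_parameters_file`.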

Hannes Hapke

SAP ConcurLabs

Hannes Hapke is a senior data scientist at SAP ConcurLabs. He’s been a machine learning enthusiast for many years and is a Google Developer Expert for machine learning. Hannes has applied deep learning to a variety of computer vision and natural language problems, but his main interest is in machine learning engineering and automating model workflows. Hannes is a coauthor of the deep learning publication Natural Language Processing in Action and he’s working on a book about Building Machine Learning Pipelines with TensorFlow Extended (O’Reilly). When he isn’t working on a deep learning project, you’ll find him outdoors running, hiking, or enjoying a good cup of coffee with a great book.

