Fueling innovative software
July 15-18, 2019
Portland, OR

End-to-end ML streaming with Kubeflow, Kafka, and Redis at scale

Nick Pinckernell (Comcast)
4:05pm-4:40pm Tuesday, July 16, 2019
ML Ops Day, Sponsored
Location: E145/146
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Machine learning engineers, software developers, and data scientists interested in model serving




At scale, large solutions are sometimes required to tackle even the smallest tasks, and ML is no different. Comcast is building architectures to handle end-to-end ML pipelining and deployments.

Nick Pinckernell outlines a configuration-based, continuously integrated and deployed solution for data transformation, normalization, and model serving, built on a range of tools and frameworks such as Kubernetes and Apache Spark. It all starts with a large Apache Spark environment that many researchers use to explore and train models. The researchers can then develop simple or complex model graphs and deploy them themselves using Kubeflow and Seldon Core. Data streams into these models through Apache Kafka, with windowing and aggregation handled by Redis.
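The abstract does not say how the windowing is implemented, so the following is only a minimal sketch of one common approach: keeping a sliding time window per key in a sorted set scored by timestamp, which is what the Redis commands ZADD and ZREMRANGEBYSCORE are often used for. Everything named here is an assumption for illustration: the InMemoryWindowStore class is a local stand-in (not a Redis client), the event fields device_id, timestamp, and value are hypothetical, and the events are plain dictionaries rather than messages read from a Kafka topic.

```python
import bisect
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window length


class InMemoryWindowStore:
    """Local stand-in for a sorted set per key (NOT a Redis client).

    add() plays the role of ZADD, remove_up_to() of ZREMRANGEBYSCORE,
    and members() of ZRANGE over the whole set.
    """

    def __init__(self):
        self._sets = defaultdict(list)  # key -> sorted list of (score, member)

    def add(self, key, score, member):
        entry = (score, member)
        if entry not in self._sets[key]:  # sorted-set members are unique
            bisect.insort(self._sets[key], entry)

    def remove_up_to(self, key, max_score):
        # Drop every entry whose score is <= max_score (i.e. outside the window).
        self._sets[key] = [(s, m) for s, m in self._sets[key] if s > max_score]

    def members(self, key):
        return [m for _, m in self._sets[key]]


def handle_event(store, event):
    """Add one event to its window and return aggregated features."""
    key = f"window:{event['device_id']}"
    ts = event["timestamp"]
    # Encode the timestamp into the member so equal values stay distinct.
    store.add(key, ts, f"{ts}:{event['value']}")
    store.remove_up_to(key, ts - WINDOW_SECONDS)
    values = [float(m.split(":", 1)[1]) for m in store.members(key)]
    return {"count": len(values), "mean": sum(values) / len(values)}


if __name__ == "__main__":
    store = InMemoryWindowStore()
    # In a real pipeline these would be consumed from a Kafka topic.
    events = [
        {"device_id": "a", "timestamp": 0, "value": 10.0},
        {"device_id": "a", "timestamp": 30, "value": 20.0},
        {"device_id": "a", "timestamp": 70, "value": 30.0},
    ]
    for event in events:
        print(handle_event(store, event))
```

In this sketch the third event pushes the first one out of the 60-second window, so the aggregate is computed over the two most recent values only. The features returned for each event are what would then be sent on to a served model.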

You’ll gain an understanding of the architecture, configuration, and technologies that are involved in making this happen at scale. Nick provides specific examples and flows of requests to example models to demonstrate all the necessary components and configuration.
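As a rough illustration of what a request flow to an example model can look like, the sketch below follows two conventions from Seldon Core: a Python model is a class exposing predict(X, features_names), and its REST prediction payload carries a "data" object with "names" and "ndarray" fields. The rest is assumed for illustration and is not taken from the talk: ThresholdModel, the "count" and "mean" features, and the "anomaly" output name are hypothetical, and serve() is a local stand-in for the HTTP round trip that Seldon Core would normally handle.

```python
import json


class ThresholdModel:
    """Hypothetical model: flags rows whose "mean" feature exceeds a threshold."""

    def __init__(self, threshold=25.0):
        self.threshold = threshold

    def predict(self, X, features_names=None):
        # Locate the "mean" column by name when names are supplied; else use column 0.
        names = list(features_names) if features_names is not None else []
        idx = names.index("mean") if "mean" in names else 0
        return [[1.0 if row[idx] > self.threshold else 0.0] for row in X]


def build_request(features):
    """Wrap a feature dictionary in a Seldon-style prediction payload."""
    names = sorted(features)
    return {"data": {"names": names, "ndarray": [[features[n] for n in names]]}}


def serve(model, request):
    """Local stand-in for the model server: unpack, predict, repack."""
    data = request["data"]
    predictions = model.predict(data["ndarray"], data.get("names"))
    return {"data": {"names": ["anomaly"], "ndarray": predictions}}


if __name__ == "__main__":
    request = build_request({"count": 2, "mean": 30.0})
    print(json.dumps(request))
    print(json.dumps(serve(ThresholdModel(), request)))
```

Keeping the model as a plain class with a predict method means the same code can be exercised locally, as here, before it is packaged and deployed behind a serving layer.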

Prerequisite knowledge

  • Familiarity with Python, JSON and YAML, Kubernetes, basic types of models, pub/sub architectures, and key-value stores

What you'll learn

  • See an example use case of model pipelining and serving
  • Take a moderately deep dive into Kubeflow, specifically Seldon Core
  • Learn about Kafka or pub/sub and NoSQL data stores, as well as some Kubernetes

Nick Pinckernell


Nick Pinckernell is a senior research engineer for the applied AI research team at Comcast, where he works on ML platforms for model serving and feature pipelining. He has focused on software development, big data, distributed computing, and research in telecommunications for many years. He’s pursuing his MS in computer science at the University of Illinois at Urbana-Champaign, and in his free time, he enjoys working with IoT.