Fueling innovative software
July 15-18, 2019
Portland, OR

Machine learning infrastructure at GitHub using Kubernetes

Michal Jastrzebski (GitHub), Hamel Husain (GitHub)
2:00pm-2:30pm Tuesday, July 16, 2019
ML Ops Day
Location: E145/146
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • DevOps, machine learning engineers, and data scientists




Michał Jastrzębski and Hamel Husain present an end-to-end machine learning product that GitHub open-sourced and that closely mirrors the machine learning infrastructure and tools the company uses internally. They walk you through how GitHub uses Kubernetes, Dask, and TensorFlow to build a product that cleans data, builds a model, and serves an API that handles live production traffic.

Code and materials will be provided so everything discussed in the talk can be replicated for learning purposes.

Prerequisite knowledge

  • A working knowledge of Docker and Kubernetes (useful but not required)

What you'll learn

  • How Kubernetes is used for machine learning operations

Michal Jastrzebski


Michał Jastrzębski is a staff data engineer at GitHub, where he builds machine learning infrastructure for internal use. Previously, he was an architect at Intel’s Open Source Technology Center. Michał has extensive experience with cloud technologies like OpenStack and Kubernetes, both as an operator and as a contributor. As the former leader of OpenStack Kolla, he managed a community of more than 200 people and almost 40 companies. Michał has also been involved with communities focused on machine learning on Kubernetes, such as Kubeflow.


Hamel Husain


Hamel Husain is a data scientist at GitHub who is focused on creating the next generation of developer tools powered by machine learning. His work involves extensive use of natural language and deep learning techniques to extract features from code and text. Previously, Hamel was a data scientist at Airbnb, where he worked on growth marketing, and at DataRobot, where he helped build automated machine learning tools for data scientists. Hamel can be reached on Twitter.