Michał Jastrzębski and Hamel Husain present an open-sourced, end-to-end machine learning product from GitHub that closely mirrors the machine learning infrastructure and tools GitHub uses internally. They walk you through how GitHub uses Kubernetes, Dask, and TensorFlow to build a product that cleans data, builds a model, and serves an API that handles live production traffic.
Code and materials will be provided so everything discussed in the talk can be replicated for learning purposes.
Michał Jastrzębski is a staff data engineer at GitHub, where he builds machine learning infrastructure for internal use. Previously, he was an architect at Intel’s Open Source Technology Center. Michał has extensive experience with cloud technologies such as OpenStack and Kubernetes, as both an operator and a contributor. As the former leader of OpenStack Kolla, he managed a community of more than 200 people and almost 40 companies. Michał has also been involved with communities focused on machine learning on Kubernetes, such as Kubeflow.
Hamel Husain is a data scientist at GitHub who is focused on creating the next generation of developer tools powered by machine learning. His work involves extensive use of natural language and deep learning techniques to extract features from code and text. Previously, Hamel was a data scientist at Airbnb, where he worked on growth marketing, and at DataRobot, where he helped build automated machine learning tools for data scientists. Hamel can be reached on Twitter.
©2019, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.