Sep 9–12, 2019

Introducing Kubeflow (w. Special Guests Tensorflow and Apache Spark)

Holden Karau (Google), Trevor Grant (IBM)
1:45pm–2:25pm Thursday, September 12, 2019
Location: LL21 E/F

Who is this presentation for?

Data Scientists / Data Engineers.




Data science, machine learning, and artificial intelligence have exploded in popularity in the last five years, but the nagging question remains: “How do we put models into production?” Engineers are typically tasked with building one-off systems to serve predictions, which must be maintained amid a quickly evolving back-end serving space that has moved from single machines, to custom clusters, to “serverless,” to Docker, to Kubernetes. In this talk, we present Kubeflow, an open source project that makes it easy for users to move models from laptop to ML rig to training cluster to deployment. We will discuss what Kubeflow is, why scalability is so critical for training and model deployment, and other topics.

Users can deploy models written with Python’s scikit-learn, R, TensorFlow, Spark, and many more. The magic of Kubernetes allows data scientists to write models on their laptops and deploy them to an ML rig; DevOps can then move those models into production with all of the bells and whistles, such as monitoring, A/B tests, multi-armed bandits, and security.

Prerequisite knowledge

Passing familiarity with Kubernetes (K8s) and TensorFlow (or other ML libraries)

What you'll learn

What is Kubeflow? Model training workflows. Deploying models to production.

Holden Karau


Holden Karau is a transgender Canadian open source developer advocate at Google focusing on Apache Spark, Beam, and related big data tools. Previously, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She is a committer on the Apache Spark, SystemML, and Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing.


Trevor Grant


Trevor Grant is a committer on Apache Mahout; a contributor to the Apache Streams (incubating), Apache Zeppelin, and Apache Flink projects; and an Open Source Technical Evangelist at IBM. In former roles he called himself a data scientist, but the term is so overused these days. He holds an MS in Applied Math and an MBA from Illinois State University. Trevor is an organizer of the newly formed Chicago Apache Flink Meetup and has presented at Flink Forward, ApacheCon, Apache Big Data, and other meetups nationwide.

Trevor was a combat medic in Afghanistan in 2009 and wrote an award-winning undergraduate thesis between missions. He has a dog, a cat, and a ’64 Ford, and he loves them all very much.
