Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK
Holden Karau
Software Engineer, Independent

@holdenkarau

Holden Karau is a transgender Canadian open source developer advocate at Google focusing on Apache Spark, Beam, and related big data tools. Previously, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She is a committer on the Apache Spark, SystemML, and Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing.

Sessions

9:00–12:30 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15
Holden Karau (Independent), Trevor Grant (IBM), Francesca Lazzeri (Microsoft)
Average rating: 4.43 (7 ratings)
Holden Karau, Francesca Lazzeri, and Trevor Grant offer an overview of Kubeflow and walk you through using it to train and serve models across different cloud environments (and on-premises). You'll use a script to do the initial setup work, so you can jump (almost) straight into training a model on one cloud and then look at how to set up serving in another cluster/cloud.
14:55–15:35 Wednesday, 1 May 2019
Holden Karau (Independent), Mikayla Konst (Google), Ben Sidhom (Google)
Average rating: 3.75 (4 ratings)
As more workloads move to serverless-like environments, the importance of properly handling downscaling increases. Holden Karau, Mikayla Konst, and Ben Sidhom explore approaches for improving the scale-down experience on open source cluster managers, covering everything from how to schedule jobs to the location of blocks and their impact.
14:05–14:45 Thursday, 2 May 2019
Data Engineering and Architecture, Expo Hall
Location: Expo Hall 2 (Capital Hall N24)
Holden Karau (Independent), Kris Nova (Independent)
Average rating: 4.86 (7 ratings)
In the Kubernetes world, where declarative resources are first-class citizens, running complicated workloads across distributed infrastructure is easy, and processing big data workloads using Spark is common practice, we can finally look at constructing a hybrid system for running Spark in a distributed, cloud native way. Join respective experts Kris Nova and Holden Karau for a fun adventure.