Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA
Holden Karau
Software Engineer, Independent

@holdenkarau

Holden Karau is a transgender Canadian software engineer working in the Bay Area. Previously, she worked at IBM, Alpine, Databricks, Google (twice), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She’s a committer on the Apache Spark, SystemML, and Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing.

Sessions

1:30pm-5:00pm Tuesday, March 26, 2019
Holden Karau (Independent), Francesca Lazzeri (Microsoft), Trevor Grant (IBM)
Average rating: 3.00 (2 ratings)
Holden Karau, Francesca Lazzeri, and Trevor Grant offer an overview of Kubeflow and walk you through using it to train and serve models across different cloud environments (and on-premises). You'll use a script to do the initial setup work, so you can jump (almost) straight into training a model on one cloud and then look at how to set up serving in another cluster/cloud.
4:40pm-5:20pm Thursday, March 28, 2019
Holden Karau (Independent), Rachel Warren (Salesforce Einstein)
Average rating: 4.60 (5 ratings)
Apache Spark is an amazing distributed system, but part of the bargain we've made with the infrastructure daemons involves providing the correct set of magic numbers (a.k.a. tuning) or our jobs may be eaten by Cthulhu. Holden Karau and Rachel Warren explore auto-tuning jobs using systems like Apache Beam, Mahout, and internal Spark ML jobs as workloads, including new settings in Spark 2.4.