Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA
Holden Karau

Software Engineer, Independent


Holden Karau is a transgender Canadian software engineer working in the Bay Area. Previously, she worked at IBM, Alpine, Databricks, Google (twice), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She’s a committer on the Apache Spark, SystemML, and Mahout projects. When not in San Francisco, Holden speaks internationally about different big data technologies (mostly Spark). She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters, and dancing.


5:10pm–5:50pm Wednesday, 03/30/2016
Spark & Beyond

Location: 210 A/E
Holden Karau (Independent)
Average rating: 4.11 (19 ratings)
Apache Spark is a fast, general engine for big data processing. As Spark jobs are used for more mission-critical tasks, it is important to have effective tools for testing and validation. Holden Karau details reasonable validation rules for production jobs and best practices for creating effective tests, as well as options for generating test data.
1:50pm–2:30pm Thursday, 03/31/2016
Office Hours

Location: Table A (O'Reilly Booth)
Holden Karau (Independent)
If you’re interested in testing and validating Spark programs, you need to talk to Holden. She’ll answer your questions about things like high-performance Spark, using Spark with non-JVM languages (e.g., Python), and contributing to Spark.