Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Machine Learning with H2O and Spark

Cliff Click (0xdata), Michal Malohlava (0xdata, Inc)
2:20pm–3:00pm Thursday, 02/19/2015
Spark in Action
Location: 210 C/G
Average rating: ***..
(3.77, 13 ratings)

H2O is open-source, in-memory, clustered big-data computing: math at scale. We have the world’s fastest logistic regression (by a lot!), distributed Deep Learning, the world’s first (and fastest) distributed Gradient Boosting Machine (GBM), plus Random Forest, PCA, KMeans++, and much more.

We fully integrate with Spark’s RDDs, so the strength and flexibility of Spark’s data munging can be combined with H2O’s world-class analytics.

This talk will demonstrate the two platforms working together to solve problems neither can solve alone.

Cliff Click


Cliff Click is the CTO and Co-Founder of 0xdata, a firm dedicated to creating a new way to think about web-scale math and real-time analytics. He wrote his first compiler when he was 15 (Pascal to TRS Z-80!), although his most famous compiler is the HotSpot Server Compiler (the Sea of Nodes IR). He helped Azul Systems build an 864-core pure-Java mainframe that keeps GC pauses on 500 GB heaps to under 10 ms, and worked on all aspects of that JVM. Before that he worked on HotSpot at Sun Microsystems, and is at least partially responsible for bringing Java into the mainstream.

He is regularly invited to speak at industry and academic conferences and has published many papers about HotSpot technology. He holds a PhD in Computer Science from Rice University and about 15 patents.

Michal Malohlava

0xdata, Inc

Michal is a geek, a developer, and a Java, Linux, and programming-languages enthusiast who has been developing software for over 10 years.
He obtained his PhD from Charles University in Prague in 2012, followed by a post-doc at Purdue University.

During his studies he was interested in the construction of not only distributed but also embedded and real-time component-based systems using model-driven methods and domain-specific languages. He participated in the design and development of various systems, including the SOFA and Fractal component systems and the jPapabench control system.

Comments on this page are now closed.


Michal Malohlava
02/25/2015 11:37am PST

Hi there,

we have published a blog post about the conference and our presentation on the H2O blog. You can find it here:

It contains:
– presentation deck
– Python notebook and raw Python code
– Scala code for Sparkling Water demo
– and data sources

Please feel free to give us any feedback or comments!

- Michal

Rajesh Haran
02/19/2015 11:30am PST

Can you please post the slides and the materials?