Presented By O'Reilly and Cloudera
Make Data Work
5–7 May, 2015 • London, UK
Sean Owen
Director of Data Science, Cloudera

@sean_r_owen

Sean is director of data science for EMEA at Cloudera. Previously, Sean founded Myrrix Ltd, which produced a real-time recommender and clustering product that evolved from Apache Mahout. Myrrix is now part of Cloudera. Sean was a primary author of the recommender components in Apache Mahout, and has been a committer and PMC member for the project. He is co-author of Advanced Analytics with Spark and Mahout in Action. Sean was previously a senior engineer at Google.


10:55–11:35 Thursday, 7/05/2015
Data Science
Location: King's Suite - Balmoral
Sean Owen (Cloudera)
Average rating: 4.94 (17 ratings)
Apache Spark has a lot to offer the data scientist: it is natively distributed and provides a REPL, Scala and Python APIs, and a machine learning library, MLlib. Spark 1.2 includes an implementation of random decision forests, an important algorithm for classification and regression. This talk will introduce Spark, Scala, and random decision forests, and demonstrate the process of analyzing a real-world data set with them.