Sean Owen

Director of Data Science, Cloudera

Website | @sean_r_owen

Sean is Director of Data Science for EMEA at Cloudera, where he helps customers build large-scale machine learning solutions on Hadoop. Previously, Sean founded Myrrix Ltd, which produced a real-time recommender and clustering product that evolved from Apache Mahout. Myrrix is now part of Cloudera. Sean was the primary author of the recommender components in Apache Mahout and has been an active committer and PMC member for the project. He is a co-author of Mahout in Action.


Data Science
Location: 113
Sean Owen (Cloudera)
Average rating: 4.00 (20 ratings)
Apache Spark is a popular new paradigm for computation on Hadoop. It's particularly effective for iterative algorithms relevant to data science, such as clustering, which can be used to detect anomalies in data. Curious? Get a taste of Spark MLlib, Scala, and k-means clustering in this walkthrough of anomaly detection as applied to network intrusion, using the KDD Cup '99 data set.
Office Hours
Location: Table A
Sean Owen (Cloudera)
If you want to use Apache Spark and clustering for anomaly detection, stop by and see Sean. He’ll answer your questions on large-scale machine learning on Hadoop; using Spark, MLlib, and Mahout; and connecting R, SAS, et al. to Hadoop for analytics.