Attendee prerequisites for this tutorial are listed below.
Mahout is an open source machine learning library from Apache. It currently focuses on collaborative filtering (recommendation engines), clustering, and classification.
Mahout has no user interface, pre-packaged distributable server, or installer. It is best thought of as a framework of tools intended to be used and adapted by developers. The algorithms in this “suite” can be used in applications ranging from recommendation engines for movie websites to early warning systems in the credit risk engines that support the card industry.
This tutorial aims to help you set up Mahout to run on Hadoop. The instructor will walk you through the basic idea behind each of the algorithms. After that, we’ll look at how each can be run on large datasets and used to solve real-world problems.
If your site, smartphone app, or viral Facebook app collects data that you want to use more productively, this session is for you!
Instructions for setting up Mahout
First, subscribe to the mahout-oscon Google Group for updates and announcements, and to discuss any issues with setting up Mahout for the tutorial.
Platforms supported by Mahout
Setup instructions
If everything went fine, you will have a compiled Mahout library on your laptop.
To verify your setup, run the following command.
If you have trouble compiling the library, send an email to the mahout-oscon Google Group. We will try to help you set up the library before the tutorial.
Questions for the speaker? Use the “Leave a Comment or Question” section at the bottom to ask them.
Robin is a committer at the Apache Software Foundation, where he works with the Mahout machine learning community. He is also a co-author of “Mahout in Action”, published by Manning Publications, a book on how Mahout is used to perform machine learning on terabytes of data with ease.
He was previously a tech lead on the ML infrastructure at Minekey Inc, a Valley-based startup focusing on recommendations and behavioral targeting for publisher content. He was introduced to the newly born Mahout community through the Google Summer of Code program while he was a dual-degree student at IIT Kharagpur. Since then, he has been adapting machine learning algorithms to the Map/Reduce format and has successfully merged his Complementary Naive Bayes and Frequent Pattern Mining implementations into the Mahout code base. He currently works as a Software Engineer at Google, Bangalore, and finds time outside work to contribute actively to the Mahout community.
Ted Dunning has been involved with a number of startups; the latest is MapR Technologies, where he is chief application architect working on advanced Hadoop-related technologies. Ted is also a PMC member for the Apache ZooKeeper and Mahout projects and contributed to the Mahout clustering, classification, and matrix decomposition algorithms. He was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics. Opinionated about software and data mining and passionate about open source, he is an active participant in the Hadoop and related communities and loves helping projects get going with new technologies.