Mahout: Mammoth Scale Machine Learning

Robin Anil (Google)

Mahout is an open source machine learning library from Apache. At present it covers collaborative filtering (recommender engines), clustering, and classification.

Mahout aims to be the machine learning tool of choice when the data to be processed is very large, perhaps far too large for a single machine. In its current incarnation, the scalable implementations are written in Java, and some portions are built on Apache’s Hadoop distributed computation project.

Mahout does not provide a user interface, a pre-packaged server, or installer. It is a framework of tools intended to be used and adapted by developers.

Robin Anil


Robin is a committer at the Apache Software Foundation, where he works with the Mahout machine learning community. He is also a co-author of “Mahout in Action” (Manning Publications), a book on how Mahout is used to perform machine learning on terabytes of data with ease.

He was previously a tech lead on the ML infrastructure at Minekey Inc, a Valley-based startup focusing on recommendations and behavioral targeting for publisher content. He was introduced to the newly born Mahout community through the Google Summer of Code program while he was a dual-degree student at IIT Kharagpur. Since then, he has been adapting machine learning algorithms to the Map/Reduce format and has successfully merged his Complementary Naive Bayes and Frequent Pattern Mining implementations into the Mahout code base. He is currently a Software Engineer at Google, Bangalore, and finds time outside work to contribute actively to the Mahout community.


Michael Stack
07/26/2010 5:05am PDT

I really liked this talk. I’m anxious to start using Mahout, and this talk got me excited about it. Nice job!

Edd Wilder-James
07/07/2010 7:18am PDT

@Bill, this presentation forms part of the regular conference sessions.

Bill Binkley
07/07/2010 7:15am PDT

Is this included in the conference cost or is there an additional fee?

Bill Binkley

Agriculture Scales Expert
