Building a Machine Learning Lab that Scales in Your Garage

Citizen Science
Location: D136
Average rating: 1.25 (4 ratings)

This presentation provides tools and techniques for doing “data-intensive” science in your own home, much as the gentlemen scientists of the early modern era did, using open source software and repurposed commodity hardware. Based on the author’s own experience as a non-specialist practitioner, it gives new data scientists a quick overview of Linux, Hadoop, Mahout, and related software, along with guidelines for setting up a test and development environment for tackling meaningful problems in machine learning.

Vin Sharma


Vin is responsible for the enterprise-focused marketing strategy for open source software on Intel architectures. In this role, Vin promotes awareness of Intel technologies in the open source ecosystem and collaborates with OEM and OSV partners to deliver open source solutions to enterprises worldwide. Before Intel, Vin worked at HP for 15 years, most recently as the strategic product planner for open source and Linux on HP ProLiant servers. Vin has an academic background in the history of technology and electrical engineering.

Comments on this page are now closed.


Shirley Bailes
07/28/2011 6:36am PDT

@William, @Casper: we’ve just discovered that the speaker had a family emergency.

Casper Bodewitz
07/28/2011 4:49am PDT

Exactly, and then they won’t let you in the only other interesting session because “it’s full”.

William McVey
07/28/2011 4:37am PDT

Seriously, who is late to their own presentation without notifying conference staff?