Advanced NFL Stats released play-by-play data for the 2002 to 2012 seasons. The play data is human-generated, so doing any data science on it is difficult until you transform it. After that, you can merge it with other datasets to get even more insight. Ideally, you want an easily queryable dataset that you can explore with Hive, Pig, or Impala.
I’ve blogged about some of the manual MapReduce jobs I’ve created based on the dataset. So far, I’ve correlated quarterbacks with the receivers they throw to most often. http://www.jesse-anderson.com/2013/01/nfl-play-by-play-analysis/
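As a rough illustration of the kind of transformation involved, here is a minimal Python sketch that pulls a passer and a receiver out of a free-text play description and counts the pairs. It is not the MapReduce code from the blog post. The description format, the regular expression, the function names (`extract_pass`, `count_targets`), and the sample plays are all assumptions made up for this example; the real dataset's wording may differ.

```python
import re
from collections import Counter

# Hypothetical description format: "<passer> pass ... to <receiver> ...".
# This pattern is an assumption for illustration, not the dataset's spec.
PASS_PATTERN = re.compile(
    r"(?P<passer>[A-Z]\.[A-Za-z'\-]+) pass .*?"
    r"(?:to|intended for) (?P<receiver>[A-Z]\.[A-Za-z'\-]+)"
)


def extract_pass(description):
    """Return (passer, receiver) if the text looks like a pass play, else None."""
    match = PASS_PATTERN.search(description)
    if match is None:
        return None
    return match.group("passer"), match.group("receiver")


def count_targets(descriptions):
    """Count how often each passer throws to each receiver."""
    counts = Counter()
    for description in descriptions:
        pair = extract_pass(description)
        if pair is not None:
            counts[pair] += 1
    return counts


# Made-up sample plays with placeholder player names, for illustration only.
SAMPLE_PLAYS = [
    "(14:21) A.Passer pass short left to B.Catcher to the 35 for 8 yards.",
    "(13:40) C.Runner up the middle for 3 yards.",
    "(13:02) A.Passer pass incomplete deep right to B.Catcher.",
    "(12:55) A.Passer pass short middle to D.Target for 5 yards.",
]

if __name__ == "__main__":
    for pair, total in count_targets(SAMPLE_PLAYS).most_common():
        print(pair, total)
```

Once each play is reduced to structured fields like these, the same counting step maps naturally onto a grouped query in Hive, Pig, or Impala.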
I am a Creative Engineer with many years of experience in creating products and helping companies improve their software engineering. I strive to provide developers with the resources to learn new technologies and improve their skillsets. I am a Curriculum Developer and Instructor at Cloudera. To help the local community, I volunteer my time as the President of the Northern Nevada Software Developers Group.