Doing Data Science On NFL Play by Play

Jesse Anderson (Smoking Hand)
Location: Portland 256 Level: Intermediate
Average rating: ***..
(3.50, 20 ratings)

Advanced NFL Stats released play-by-play data for the 2002 to 2012 seasons. The play descriptions are written by humans, so doing any data science on them is difficult until you transform them into structured fields. After that, you can merge the result with other datasets to get even more insight. Ideally, you want an easily queryable dataset that you can explore with Hive, Pig, or Impala.
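To give a feel for the transformation step, here is a minimal sketch in Python that pulls structured fields out of one free-text play description. The description format, the regular expression, and the function name `parse_pass_play` are assumptions made for illustration; the real dataset's wording varies far more than this pattern allows, and this is not the code used in the talk.

```python
import re

# Assumed name format: initial, dot, surname (e.g. "T.Brady").
NAME = r"[A-Z][A-Za-z]*\.[A-Z][A-Za-z'\-]+"

# Hypothetical pattern for a pass play; real descriptions are messier.
PASS_RE = re.compile(
    r"(?P<passer>" + NAME + r") pass "
    r"(?:(?P<result>incomplete) )?"
    r"(?:(?P<depth>short|deep) (?P<direction>left|middle|right) )?"
    r"to (?P<receiver>" + NAME + r")"
    r"(?:.*? for (?P<yards>-?\d+) yards?)?"
)


def parse_pass_play(description):
    """Return a dict of fields for a pass play, or None if it is not one."""
    match = PASS_RE.search(description)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "passer": fields["passer"],
        "receiver": fields["receiver"],
        "complete": fields["result"] is None,
        "depth": fields["depth"],
        "direction": fields["direction"],
        # Yards are only present when the description reports a gain or loss.
        "yards": int(fields["yards"]) if fields["yards"] is not None else None,
    }


if __name__ == "__main__":
    # Made-up sample line in the assumed format.
    print(parse_pass_play("(12:34) T.Brady pass short right to W.Welker for 7 yards."))
```

Once every play is reduced to a flat record like this, the rows can be written out as delimited text and exposed as a table to a SQL-style engine.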

I’ve blogged about some of the hand-written MapReduce jobs I’ve created based on the dataset. So far, I’ve correlated quarterbacks with their most-thrown-to receivers.
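The quarterback-to-receiver correlation can be sketched as a count over (passer, receiver) pairs, which is roughly what a MapReduce job would do with the pair as its key. This is a small in-memory illustration, not the MapReduce code from the blog posts; the function name `top_receivers` and the sample pairs are hypothetical.

```python
from collections import Counter


def top_receivers(pairs):
    """Map each passer to (receiver, targets) for the receiver thrown to most.

    `pairs` is an iterable of (passer, receiver) tuples, one per pass play.
    Ties go to the pair that was seen first.
    """
    # Equivalent of the reduce step: sum one count per (passer, receiver) key.
    counts = Counter(pairs)
    best = {}
    for (passer, receiver), targets in counts.items():
        if passer not in best or targets > best[passer][1]:
            best[passer] = (receiver, targets)
    return best


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("T.Brady", "W.Welker"),
        ("T.Brady", "R.Gronkowski"),
        ("T.Brady", "W.Welker"),
        ("P.Manning", "R.Wayne"),
    ]
    print(top_receivers(sample))
```

The same grouping could be expressed as a `GROUP BY` query in Hive or Impala once the plays are in a structured table.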


Jesse Anderson

Smoking Hand

I am a Creative Engineer with many years of experience in creating products and helping companies improve their software engineering. I strive to provide developers with the resources to learn new technologies and improve their skillsets. I am a Curriculum Developer and Instructor at Cloudera. To help the local community, I volunteer my time as the President of the Northern Nevada Software Developers Group.

