The Hitchhiker’s Guide to A Kaggle Competition

Krishna Sankar (Blackarrow.tv)
Data: Roulette
Location: Oregon Ballroom 203
Average rating: 3.00 out of 5 (3 ratings)

An introductory hands-on workshop on the Heritage Health Prize competition, aimed at the Amateur Data Scientists among us. First, we will quickly look at the classes of algorithms and what they do, using competition problems and datasets. Next, we will dig deeper into one competition, the Kaggle RTA Challenge (Ensemble/Random Forest). We will then dive into the Heritage Health Prize, work through the dataset, and submit an entry!

Note: While there is not enough time for participants to work through the different datasets, we will provide links to a hands-on tutorial that you can all do after the workshop.


  • Algorithms for the Amateur Data Scientist
    • A look at the broader algorithms leading to Trees & Random Forests
  • The Art of Analytics Competitions – The Kaggle challenges
  • Anatomy of a competition – How the RTA was won
    • Predicting traffic at RTA using Ensemble/Random Forest Trees
  • Competition in flight – The HHP
    • Dataset Organization
    • Analytics Walkthrough
    • Submit our entry
  • Conclusion

Krishna Sankar


Krishna Sankar is a Chief Data Scientist at blackarrow.tv, where he focuses on enhancing user experience via inference, intelligence & interfaces. Earlier stints include Principal Architect/Data Scientist at Tata America Intl, Director of Data Science at a bioinformatics startup, and Distinguished Engineer at Cisco. He has been speaking at various conferences (OSCON, PyCon, PyData) about predicting NFL [http://goo.gl/QCpaO8], Spark [http://goo.gl/E4kqMD], Data Science [http://goo.gl/9pyJMH], Machine Learning [http://goo.gl/SXF53n], and Social Media Analysis [http://goo.gl/D9YpVQ], and has been guest lecturing at the Naval Postgraduate School. His other passion is Lego Robotics; you will find him at the St. Louis FLL World Competition as a Robot Design Judge.



Krishna Sankar
07/27/2011 5:01pm PDT
There was a question in today’s workshop about good books on algorithms. The best lists I have seen are the answers at Quora and one at LinkedIn:
Krishna Sankar
07/25/2011 3:59pm PDT

I have uploaded a WIP snapshot at www.slideshare.net/ksankar/.... Would appreciate any comments. Beware: I have too many slides, and that is intentional.