
Machine Learning Applications: Recommendation Engines Using Multiple Behavior Sources

Ted Dunning (MapR)
Hardcore Data Science Gramercy Suite

Machine learning constructs such as recommendation engines often take a simplistic approach to data modeling: a single kind of user interaction with a single kind of item is used to suggest the same kind of interaction with the same kind of item. In practice, however, this approach is flawed for several reasons. First, multiple kinds of interactions with multiple kinds of items are typically available for training the recommendation engine to make suggestions. Second, recommendation is better viewed as a ranking problem than as a regression problem. Finally, practical recommendation systems should be constantly self-training, as today’s recommendations and selections can be used to train tomorrow’s recommender.

This session will shed light on a practical recommendation architecture and implementation style that addresses all of the above issues and is considerably easier to implement and deploy than conventional approaches. Several of the techniques that I will describe have never (to my knowledge) appeared in the research literature. The session will also describe how the self-feeding and data-hungry nature of recommendation algorithms makes supposedly secondary considerations like result order dithering more important than algorithm choice.
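
The abstract names result order dithering without saying how it is done. As a rough illustration only, the sketch below assumes one common formulation: each item's log-rank is perturbed with Gaussian noise and the list is re-sorted, so top results mostly stay near the top while deeper results occasionally surface and can generate new training data. The function name `dither`, the parameter `epsilon`, and the choice of `log(epsilon)` as the noise standard deviation are assumptions made for this example, not details taken from the session.

```python
import math
import random


def dither(ranked_items, epsilon=2.0, seed=None):
    """Return a noisily reordered copy of a best-first ranked list.

    Assumed formulation: score = log(rank) + Gaussian noise with
    standard deviation log(epsilon); items are re-sorted by score.
    epsilon == 1.0 gives zero noise and leaves the order unchanged.
    """
    if epsilon < 1.0:
        raise ValueError("epsilon must be >= 1.0")
    rng = random.Random(seed)
    sigma = math.log(epsilon)
    scored = []
    for rank, item in enumerate(ranked_items, start=1):
        # Working in log-rank means swaps near the top are rarer than
        # swaps deep in the list for the same amount of noise.
        score = math.log(rank) + rng.gauss(0.0, sigma)
        scored.append((score, rank, item))
    # Break ties by original rank so the result is deterministic.
    scored.sort(key=lambda entry: (entry[0], entry[1]))
    return [item for _, _, item in scored]


if __name__ == "__main__":
    results = ["item%02d" % i for i in range(1, 21)]
    # A different seed per page view shows users slightly different orderings.
    print(dither(results, epsilon=2.0, seed=7)[:10])
```

Larger values of `epsilon` mix the list more aggressively. In a deployed system the seed would typically change over time, so that repeat visitors see, and can react to, items that a fixed ordering would never expose.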


Ted Dunning

MapR

Ted Dunning is the chief technology officer at MapR. He’s also a board member for the Apache Software Foundation; a PMC member and committer of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects; and a mentor for various incubator projects. Ted has years of experience with machine learning and other big data solutions across a range of sectors. He’s contributed to clustering, classification, and matrix decomposition algorithms in Mahout and to the new Mahout Math library, and he designed the t-digest algorithm used in several open source projects and by a variety of companies. Previously, Ted was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud-detection systems for ID Analytics (LifeLock). Ted has coauthored a number of books on big data topics, including several published by O’Reilly related to machine learning, and has 24 issued patents to date plus a dozen pending. He holds a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.


Comments

10/29/2013 7:11pm EDT

They will be on the strata web site shortly. Also see

http://slidesha.re/1f1NE0W

Benjamin Bengfort
10/29/2013 11:33am EDT

To ask the question that commonly gets asked: is there a place we could get the slides?
