Customer Behavior Modeling at Scale

Data Science
Location: Room 1-6
Level: Intermediate
Average rating: 4.75 (4 ratings)

Nearest neighbor models are conceptually about the simplest kind of model possible. The problem is that they have generally been infeasible to apply at scale, or at least they were until the advent of Big Data techniques. This talk describes some of the techniques used in the knn project to reduce thousand-year computations to a few hours. The knn project uses the Mahout math library and Hadoop to speed up these enormous computations to the point that they can be usefully applied to real problems. The same techniques can also be used for real-time model scoring.
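
To make the cost concrete, a brute-force nearest neighbor classifier can be sketched in a few lines of Python. This is only an illustration of the naive method, not the knn project's code (which builds on Mahout and Hadoop); the function name knn_predict, the toy data, and the labels are all hypothetical.

```python
import heapq
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (vector, label) pairs. Every call scans all of
    `train`, so the naive method costs O(n) distance computations per query.
    """
    # Pick the k training pairs with the smallest Euclidean distance to query.
    nearest = heapq.nsmallest(
        k, train, key=lambda pair: math.dist(pair[0], query)
    )
    # Majority vote over the labels of those k neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature vector, label)
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.2), "high"),
    ((5.1, 4.9), "high"),
    ((4.8, 5.0), "high"),
]

prediction = knn_predict(train, (1.1, 1.0), k=3)
```

Because each query touches every training point, scoring n queries against n points takes on the order of n squared distance computations, which is the kind of cost that becomes prohibitive on large data sets without the speedups the talk describes.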

The talk begins with financial applications and continues with applications in life sciences, genomics, web metrics, and recommendations.

Ted Dunning

MapR, now part of HPE

Ted Dunning has been involved with a number of startups, the latest being MapR Technologies, where he is Chief Application Architect working on advanced Hadoop-related technologies. He is also a PMC member for the Apache ZooKeeper and Mahout projects. Opinionated about software and data mining and passionate about open source, he is an active participant in the Hadoop and related communities and loves helping projects get going with new technologies.

