Data Science of Love

Vaclav Petricek (eHarmony)
Data Science Beekman Parlor - Sutton North

Matchmaking is an age-old concept that has been revolutionized by the advent
of the Internet. Suddenly the pool of potential partners that one can plausibly
consider has exploded. Because courtship has moved online, we have collected
unprecedented amounts of data on romantic interactions.

If you are looking for love, you may want to take advantage of this accumulated
knowledge to give yourself a leg up. However, making causal inferences (aka
“dating advice”) can be problematic due to various sample biases. I will instead
show how this data can be leveraged to build a matchmaking system that reduces
data overload and improves your chances of a happy marriage.

In this presentation I will focus specifically on solving three problems with data:

  1. Compatibility: matching for the long term based on psychological traits
  2. Affinity: modeling the immediate attraction
  3. Distribution: whom to introduce to whom, and when

I will show how Hadoop, Vowpal Wabbit, GBMs, and graph optimization can be used
together to solve the matchmaking problem as well as related problems in
advertising and constrained recommendation. The presentation will also
highlight the architecture of eHarmony’s matchmaking system, which every night
needs to choose the best set of introductions from about 2^10^12 possibilities.
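
To give a feel for the distribution problem, here is a minimal toy sketch and not eHarmony's actual method. It assumes a hypothetical integer score for every possible pair and a one-introduction-per-person constraint, and it finds the best set of introductions by exhaustive search. The helper `count_introduction_sets` illustrates one possible reading of the figure above, namely that each potential pair is either introduced or not; that reading is an assumption, not something the abstract states.

```python
from itertools import permutations


def best_introductions(scores):
    """Return (total, pairs) for the best one-to-one set of introductions.

    scores[i][j] is a hypothetical predicted score for introducing person i
    of one group to person j of the other. Exhaustive search tries every
    assignment, so it is only feasible for tiny inputs.
    """
    n = len(scores)
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total = total
            best_pairs = [(i, perm[i]) for i in range(n)]
    return best_total, best_pairs


def count_introduction_sets(n_left, n_right):
    """Number of subsets of possible pairs, assuming each pair is in or out."""
    return 2 ** (n_left * n_right)


# Made-up scores for three people on each side.
scores = [
    [9, 8, 3],
    [8, 2, 1],
    [4, 6, 5],
]
total, pairs = best_introductions(scores)
print(total, pairs)
print(count_introduction_sets(3, 3))
```

In this toy input, greedily taking the single highest-scoring pair first (person 0 with person 0) leads to a worse overall total than the assignment found by the search, which is why the choice has to be made jointly. Exhaustive search stops being practical beyond a handful of people, so a system at real scale would need dedicated optimization methods.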

Vaclav Petricek


Vaclav Petricek is a Principal Data Scientist at Santa Monica-based eHarmony, where he is responsible for optimization and machine learning applications for eHarmony's core matchmaking algorithms. He also runs a series of invited ML talks at eHarmony, part of the Los Angeles Machine Learning Meetup. Prior to eHarmony, Vaclav was a visiting researcher at University College London, where his research spanned recommender systems, social networks, web structure, and online auctions. Before that, he worked at several Czech startups as a developer and sysadmin. He earned his PhD in Computer Science from Charles University in Prague, as well as his master's in Distributed Systems.


Contact Us

View a complete list of Strata + Hadoop World 2013 contacts