
Chicago Bars, Prisoner’s Dilemma, and Practical Models in Search

Chris Harland (Textio)
Data Science
Ballroom AB
Average rating: *****
(5.00, 13 ratings)

The runaway success of predictive modelling, most notably visible in Kaggle competitions, has seen data scientists of all approaches try their hand at building models with an ever-growing corpus of data. However, putting these models to operational or influential use is difficult. Often the important features of a model have physical meanings that are difficult to interpret or, more commonly, are not correlated with any change a business can make to its product or service. Without a strong link between model features and business actions, predictive models can look great but fall short of delivering.

This talk will provide a sharp contrast between the “textbook best” and the “most successful” model in a fun and non-traditional (to data science anyway) business setting, that of North Side Chicago bars. I will describe the problems facing these businesses, the data science used to solve their problems (with particular attention paid to building the most accurate model vs. the model that makes the most money), and relate the collective success of each competitive bar to the classic “prisoner’s dilemma.” From this base, I will build into a much larger problem facing internet search, that of understanding user intent and satisfaction. I will again provide a short case study in Bing search, drawing on the comparisons built previously in the Chicago bar examples.
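For readers unfamiliar with the prisoner's dilemma, the sketch below may help. It is only a toy illustration, not material from the talk: the payoff numbers, the action names, and the idea that "cooperate" stands for something like two bars sharing data are all assumptions made for the example.

```python
# Toy prisoner's dilemma for two competing bars.
# All payoffs are hypothetical (arbitrary units), not taken from the talk.
# Key: (bar A's action, bar B's action) -> (A's payoff, B's payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
ACTIONS = ("cooperate", "defect")


def best_response(opponent_action):
    """Return the action that maximizes bar A's payoff against a fixed opponent action."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, opponent_action)][0])


def is_nash(a, b):
    """The game is symmetric, so bar B's best response to `a` is best_response(a)."""
    return best_response(b) == a and best_response(a) == b


# Action pairs where neither bar gains by changing its choice alone.
nash = [(a, b) for a in ACTIONS for b in ACTIONS if is_nash(a, b)]

# Combined payoff for every pair of choices.
joint = {pair: sum(payoff) for pair, payoff in PAYOFFS.items()}
```

With these assumed numbers, defecting is each bar's best response whatever the other does, yet mutual cooperation gives the highest combined payoff. That gap between individual incentive and collective outcome is what the dilemma describes.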

As a physicist turned data scientist, I have found the lesson of balancing the trade-off between accuracy and performance a crucial one in moving out of tutorials, books, and examples and into real data and problems where actionable insights are paramount. This talk will balance the technical details of modeling with a broader appreciation for the role of modeling in serving results to clients and partners.

Outline:

  • Is the most accurate model always the best one?
  • Why it’s tough to be a bar owner in Chicago (how data science can help out a bar)
  • A rising tide lifts all boats (how cooperating with your competition is good for you)
  • Collaborative filtering for Corona (building a successful model)
  • What is success in search?
  • Predicting user intent (defining a data science problem in Bing search)
  • What do you mean when you say “Tom Cruise”? (using data science to assign correct search intent)

Chris Harland

Director of Data Engineering, Textio

Chris Harland is a Data Scientist at Microsoft working on problems in Bing search, Windows, and MSN. He holds a PhD in Physics from the University of Oregon and has worked in a wide variety of fields spanning elementary science education, cutting-edge biophysical research, and recommendation/personalization engines.

Chris came to Microsoft and data science by way of the University of Chicago, where, after completing a post-doc, he founded a data science consulting start-up and gained a large array of data science skills. Chris enjoys all things data science, from blogging and tutorials to operational models that impact product users every day.