Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Using deep learning to automatically rank millions of hotel images

Christopher Lennan (idealo.de)
4:40pm-5:20pm Thursday, March 28, 2019
Secondary topics:  Deep Learning, Retail and e-commerce

Who is this presentation for?

  • Machine learning professionals

Level

Intermediate

Prerequisite knowledge

  • A basic understanding of supervised machine learning
  • Familiarity with neural networks (useful but not required)

What you'll learn

  • Explore the lifecycle of a large-scale deep learning project, from prototyping on a public dataset to fine-tuning on in-house labeled data to the deployment of the system in production and its business impact
  • Learn a state-of-the-art approach to training neural networks on ordered labels (Earth Mover's Distance loss versus cross-entropy loss), along with visualization techniques for CNNs

Description

Idealo.de runs a dedicated hotel price comparison service. The company receives dozens of images for each hotel and faces the challenge of choosing the most “attractive” image for its offer comparison pages, as photos can be just as important for bookings as reviews. With millions of hotel offers, more than 100 million images need an “attractiveness” assessment.

Idealo.de addressed this challenge by implementing an aesthetic and technical image quality classifier based on Google’s research paper “NIMA: Neural Image Assessment.” NIMA consists of two convolutional neural networks (CNNs) that aim to predict the aesthetic and technical quality of images, respectively. The models are trained via transfer learning, where ImageNet-pretrained CNNs are fine-tuned for each quality classification task.
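
The abstract does not say how a model's output becomes a ranking. In the NIMA paper, each network predicts a probability distribution over the rating buckets 1 to 10, and the mean of that distribution serves as the image's score. The sketch below assumes that setup; the hotel image names, the predicted distributions, and the helper functions are hypothetical and only illustrate how one image per hotel could be selected.

```python
import numpy as np

# Rating buckets 1..10, following the NIMA paper (assumed to apply here).
SCORES = np.arange(1, 11)

def mean_score(dists):
    """Expected rating for one distribution (10,) or a batch (n, 10)."""
    return np.asarray(dists, dtype=float) @ SCORES

def best_image(image_ids, dists):
    """Return the id with the highest mean score, plus all mean scores."""
    scores = mean_score(dists)
    return image_ids[int(np.argmax(scores))], scores

# Hypothetical predictions for three images of a single hotel.
image_ids = ["lobby.jpg", "room.jpg", "parking.jpg"]
dists = np.zeros((3, 10))
dists[0, :] = 0.1        # uniform over all ratings
dists[1, [6, 7]] = 0.5   # mass split between ratings 7 and 8
dists[2, 2] = 1.0        # all mass on rating 3

best_id, scores = best_image(image_ids, dists)
print(best_id, scores)
```

Ranking by a single scalar keeps the selection step cheap, which matters when the same computation runs over a very large image collection.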

Christopher Lennan shares the training approach and peculiarities of the models (e.g., the Earth Mover’s Distance objective function) as well as major insights gained from each iteration, including the importance of collecting high-quality labeled data. Finally, he sheds light on what the trained models actually learned by visualizing their convolutional filter weights and output nodes, and illustrates how this helped idealo.de optimize the models.
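
To illustrate why an Earth Mover's Distance objective suits ordered labels, here is a minimal NumPy sketch. It assumes the form used in the NIMA paper (the r-norm of the difference between cumulative distributions, with r = 2) and 10 rating buckets; the function names and example distributions are hypothetical, and this is not idealo.de's training code.

```python
import numpy as np

def emd_loss(p_true, p_pred, r=2):
    """EMD between two distributions over ordered buckets (last axis)."""
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    # Compare cumulative distributions so distance along the scale counts.
    cdf_diff = np.cumsum(p_true, axis=-1) - np.cumsum(p_pred, axis=-1)
    return np.mean(np.abs(cdf_diff) ** r, axis=-1) ** (1.0 / r)

def cross_entropy(p_true, p_pred, eps=1e-12):
    """Standard cross-entropy, which ignores the order of the buckets."""
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return -np.sum(p_true * np.log(p_pred + eps), axis=-1)

# Ground truth: all mass on rating 5 (index 4).
true = np.eye(10)[4]

# Two wrong predictions with the same probability (0.1) on the true bucket.
near = np.zeros(10); near[4], near[5] = 0.1, 0.9   # off by one rating
far = np.zeros(10);  far[4], far[9] = 0.1, 0.9     # off by five ratings

print("cross-entropy:", cross_entropy(true, near), cross_entropy(true, far))
print("EMD:          ", emd_loss(true, near), emd_loss(true, far))
```

Cross-entropy assigns both predictions the same loss because it only looks at the probability on the true bucket, whereas the EMD penalizes the prediction that lands further away on the rating scale more heavily.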

Christopher Lennan

idealo.de

Christopher Lennan is a senior data scientist at idealo.de, where he works on computer vision problems to improve the product search experience. In previous positions, he applied machine learning methods to fMRI and financial data. Christopher holds a master’s degree in statistics from Humboldt Universität Berlin.