Presented By
O’Reilly + Cloudera

Make Data Work

March 25-28, 2019
San Francisco, CA

Using deep learning to automatically rank millions of hotel images

Christopher Lennan (idealo.de)

4:40pm–5:20pm Thursday, March 28, 2019

Data Science, Machine Learning & AI
Location: 2016

Secondary topics: Deep Learning, Retail and e-commerce

Average rating: 4.00 (1 rating)

Who is this presentation for?

Machine learning professionals

Level

Intermediate

Prerequisite knowledge

A basic understanding of supervised machine learning
Familiarity with neural networks (useful but not required)

What you'll learn

Explore the lifecycle of a large-scale deep learning project, from prototyping on a public dataset to fine-tuning on in-house labeled data to the deployment of the system in production and its business impact
Learn a state-of-the-art approach to training neural networks on ordered labels (the Earth Mover's Distance loss rather than cross-entropy), along with visualization techniques for CNNs

Description

Idealo.de has a dedicated service to provide hotel price comparisons. The company receives dozens of images for each hotel and faces the challenge of choosing the most “attractive” image for its offer comparison pages, as photos can be just as important for bookings as reviews. The millions of hotel offers mean that there are more than 100 million images that need an “attractiveness” assessment.

Idealo.de addressed this challenge by implementing an aesthetic and technical image quality classifier based on Google’s research paper “NIMA: Neural Image Assessment.” NIMA consists of two convolutional neural networks (CNNs) that predict the aesthetic and technical quality of images, respectively. The models are trained via transfer learning: CNNs pretrained on ImageNet are fine-tuned for each quality classification task.
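As a rough illustration of how such a classifier can drive image selection: in the NIMA paper each network outputs a probability distribution over ten score buckets (1 to 10), and the mean of that distribution serves as a single quality score. The sketch below assumes the predicted distributions are already available. The function names, image names, and the choice to rank by mean score alone are hypothetical, not idealo.de's actual pipeline.

```python
import numpy as np


def mean_score(distribution):
    """Expected score of a predicted distribution over the buckets 1..N."""
    p = np.asarray(distribution, dtype=float)
    p = p / p.sum()  # guard against small normalization errors
    scores = np.arange(1, p.size + 1)
    return float(np.dot(p, scores))


def pick_best_image(predictions):
    """Return the key of the image with the highest mean score.

    `predictions` maps an image identifier to its predicted score
    distribution (hypothetical data layout, for illustration only).
    """
    return max(predictions, key=lambda name: mean_score(predictions[name]))


# Made-up predictions for two images of the same hotel.
example = {
    "lobby.jpg": [0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1, 0.0, 0.0, 0.0],
    "pool.jpg": [0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.4, 0.2, 0.1, 0.0],
}
best = pick_best_image(example)
```

Here "pool.jpg" is selected because its distribution puts more mass on the higher score buckets. A production system would presumably combine the aesthetic and technical scores in some way, which this sketch does not attempt.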

Christopher Lennan shares the training approach and peculiarities of the models (e.g., the Earth Mover’s Distance objective function) as well as major insights gained from each iteration, including the importance of collecting high-quality labeled data. Finally, he sheds light on what the trained models actually learned by visualizing the convolutional filter weights and output nodes of the trained models and illustrates how this helped idealo.de optimize the models.
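To make the Earth Mover's Distance objective concrete, here is a minimal NumPy sketch of the loss as defined in the NIMA paper: the r-norm of the difference between the cumulative distributions of the true and predicted scores, with r = 2. The example distributions are made up, and the code is an illustration rather than the implementation presented in the talk.

```python
import numpy as np


def emd_loss(p_true, p_pred, r=2):
    """Earth Mover's Distance between two distributions over ordered buckets."""
    cdf_true = np.cumsum(np.asarray(p_true, dtype=float))
    cdf_pred = np.cumsum(np.asarray(p_pred, dtype=float))
    return float(np.mean(np.abs(cdf_true - cdf_pred) ** r) ** (1.0 / r))


def cross_entropy(p_true, p_pred, eps=1e-12):
    """Categorical cross-entropy; ignores the ordering of the buckets."""
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(-np.sum(p_true * np.log(p_pred + eps)))


def one_hot(index, size=10):
    """Distribution with all mass on one bucket (0-based index)."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


# True label: score bucket 5. Both predictions give it probability 0.1,
# but "near" puts the remaining mass on bucket 6 and "far" on bucket 10.
truth = one_hot(4)
near = 0.1 * one_hot(4) + 0.9 * one_hot(5)
far = 0.1 * one_hot(4) + 0.9 * one_hot(9)

ce_near, ce_far = cross_entropy(truth, near), cross_entropy(truth, far)
emd_near, emd_far = emd_loss(truth, near), emd_loss(truth, far)
```

Cross-entropy assigns both predictions the same loss because it only looks at the probability given to the true bucket. The EMD loss is larger for the "far" prediction, so it penalizes errors by how many buckets away the misplaced mass lies, which is the property that matters for ordered labels.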

Christopher Lennan

idealo.de

Christopher Lennan is a senior data scientist at idealo.de, where he works on computer vision problems to improve the product search experience. In previous positions, he applied machine learning methods to fMRI and financial data. Christopher holds a master’s degree in statistics from Humboldt Universität Berlin.

Sponsorship Opportunities

For exhibition and sponsorship opportunities, email strataconf@oreilly.com

Partner Opportunities

For information on trade opportunities with O'Reilly conferences, email partners@oreilly.com

©2019, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com