Sep 9–12, 2019

Using deep learning models to extract the most value from 360-degree images

Shourabh Rawat (Trulia)
4:00pm–4:40pm Thursday, September 12, 2019
Location: 230 A

Level

Intermediate

Recent camera advances that enable automatic panorama generation have made 360-degree images ubiquitous in industries ranging from real estate to ecommerce and travel. These panoramic views offer an immersive experience that benefits consumers, but they also make it harder for businesses to direct viewers to the most important parts of a scene. Trulia’s parent company, Zillow Group, uses this technology to create 3-D home views that let users see a complete view of a room and find the perfect home. Because a panorama covers such a wide field of view, businesses must ensure viewers see the most engaging part of the image first. This becomes paramount when a panorama has to be represented as a static 2-D image. The key is to identify a salient thumbnail, chosen to give the most informative view of each panorama, that helps drive engagement.

Shourabh Rawat explores how to train and use saliency score models, deep learning techniques, and algorithms to identify and extract the most visually informative and pleasing viewpoints for this salient thumbnail. To compute a saliency score, Trulia relies on three different deep convolutional neural networks. The scene model captures how representative a viewpoint is, to ensure the most relevant photos are chosen for a real estate listing (e.g., a kitchen or living room rather than a blank wall or window). The attractiveness model penalizes low visual quality, such as blurry or dark photos, and rewards aesthetically pleasing photos with a high score; because home location and listing price tend to affect photo quality as well, it also involves training a deep learning model to label properties as either luxury or fixer-upper. The appropriateness model differentiates relevant viewpoints, like views of a bedroom, from irrelevant ones, like walls or humans.
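
The description does not say how the three model outputs are combined into a single saliency score. As a minimal sketch, assuming each network emits a score in [0, 1] and assuming a weighted geometric mean as the combination rule, the idea might look like the following. Both assumptions are illustrative, and the names `ViewpointScores` and `saliency_score` are hypothetical rather than Trulia's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewpointScores:
    """Hypothetical per-viewpoint outputs of the three networks, each assumed in [0, 1]."""
    scene: float            # how representative the view is of a room type
    attractiveness: float   # visual quality and aesthetics
    appropriateness: float  # relevant content versus walls, people, etc.


def saliency_score(scores, weights=(1.0, 1.0, 1.0)):
    """Combine the three scores with a weighted geometric mean (an assumed rule)."""
    values = (scores.scene, scores.attractiveness, scores.appropriateness)
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("each model score must lie in [0, 1]")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    product = 1.0
    for value, weight in zip(values, weights):
        product *= value ** weight
    return product ** (1.0 / sum(weights))


# Made-up scores for two candidate viewpoints of the same panorama.
candidates = {
    "kitchen": ViewpointScores(scene=0.9, attractiveness=0.8, appropriateness=0.95),
    "blank_wall": ViewpointScores(scene=0.2, attractiveness=0.6, appropriateness=0.1),
}
ranked = sorted(candidates, key=lambda name: saliency_score(candidates[name]), reverse=True)
print(ranked)
```

A geometric mean is used here only because it lets a near-zero score from any one model veto a viewpoint; a weighted sum would be an equally plausible choice under different assumptions.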

Prerequisite knowledge

  • A basic understanding of deep learning algorithms and models

What you'll learn

  • Learn to create a saliency model that defines criteria for salient thumbnails, ensuring they are representative, attractive, and diverse
  • Discover how to extract salient thumbnails while maintaining important aspects like a specific field of view, 3-D orientation, aspect ratio, and viewport size; create an algorithm to rank all potential thumbnails extracted from the panorama against defined saliency criteria using scene, attractiveness, and appropriateness; and deploy these images within your organization’s practices
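
The extraction and ranking steps listed above can be illustrated with a short sketch. It assumes the panorama is stored as an equirectangular image in a NumPy array and renders a perspective viewport for a given yaw, pitch, field of view, and output size using nearest-neighbor sampling. The function names, the angle conventions, and the `score_fn` callback (a stand-in for whatever saliency model is used) are assumptions made for illustration, not details from the talk.

```python
import numpy as np


def extract_viewport(pano, yaw_deg, pitch_deg, fov_deg, out_w, out_h):
    """Render a perspective view from an equirectangular panorama (H x W [x C])."""
    pano_h, pano_w = pano.shape[:2]
    # Focal length in pixels, derived from the horizontal field of view.
    focal = 0.5 * out_w / np.tan(np.radians(fov_deg) / 2.0)
    # Pixel grid centered on the optical axis (x right, y down, z forward).
    xs = np.arange(out_w) - (out_w - 1) / 2.0
    ys = np.arange(out_h) - (out_h - 1) / 2.0
    x, y = np.meshgrid(xs, ys)
    rays = np.stack([x, y, np.full_like(x, focal)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    # Rotate rays: pitch about the x axis (positive looks up), then yaw about y.
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(pitch), -np.sin(pitch)],
                      [0.0, np.sin(pitch), np.cos(pitch)]])
    rot_y = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(yaw), 0.0, np.cos(yaw)]])
    world = rays @ (rot_y @ rot_x).T
    # Convert ray directions to longitude/latitude, then to panorama pixels.
    lon = np.arctan2(world[..., 0], world[..., 2])
    lat = np.arcsin(np.clip(world[..., 1], -1.0, 1.0))
    u = ((lon / (2.0 * np.pi) + 0.5) * pano_w).astype(int) % pano_w
    v = np.clip(((lat / np.pi + 0.5) * pano_h).astype(int), 0, pano_h - 1)
    return pano[v, u]


def rank_viewpoints(pano, score_fn, yaws, pitch_deg=0.0, fov_deg=90.0, size=(320, 240)):
    """Score one viewport per yaw angle and return (score, yaw) pairs, best first."""
    out_w, out_h = size
    scored = [
        (float(score_fn(extract_viewport(pano, yaw, pitch_deg, fov_deg, out_w, out_h))), yaw)
        for yaw in yaws
    ]
    return sorted(scored, reverse=True)


# Toy panorama: dark everywhere except a bright band around yaw = 90 degrees.
toy_pano = np.zeros((180, 360))
toy_pano[:, 250:290] = 1.0
# Mean brightness stands in for a real saliency model here.
ranking = rank_viewpoints(toy_pano, np.mean, yaws=[0, 90, 180, 270], size=(64, 48))
print(ranking[0])
```

In practice the brightness stand-in would be replaced by a learned saliency score, and the candidate set would likely sweep pitch and field of view as well as yaw.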

Shourabh Rawat

Trulia

Shourabh Rawat is a manager of data science in the data engineering organization at Trulia (Zillow Group). He has more than five years of industry experience working in AI, deep learning, computer vision, and personalization, and deploying these systems to production at scale. Shourabh and his team focus on developing data science solutions to better understand Trulia’s customers, specifically how they engage with content and property recommendations. Shourabh completed his master’s degree at Carnegie Mellon University, where he did research on event detection in consumer videos, applying deep learning to multimodal (audio and image) data.
