Using deep learning models to extract the most value from 360-degree images
Recent camera advances that enable automatic panorama generation have made 360-degree images ubiquitous in industries ranging from real estate to ecommerce and travel. These panoramic views give consumers an immersive experience. Trulia’s parent company, Zillow Group, uses the technology to create 3-D home views that let users see a complete view of a room and find the perfect home. But the wide field of view also makes it hard for businesses to direct viewers to the most important parts of the scene, so they must ensure viewers see the most engaging part of the image first. This need becomes paramount when a panorama has to be represented as a static 2-D image. The key is to identify a salient thumbnail: a view chosen to be the most informative for each panorama and to help drive engagement.
Shourabh Rawat explores how to train and use saliency score models, deep learning techniques, and algorithms to identify and extract the most visually informative and pleasing viewpoints for this salient thumbnail. To compute a saliency score, Trulia relies on three deep convolutional neural networks:
- The scene model captures how representative a viewpoint is, so that the most relevant photos are chosen for a real estate listing (e.g., a kitchen or living room rather than a blank wall or window).
- The attractiveness model penalizes low visual quality, such as blurry or dark photos, and rewards aesthetically pleasing photos with a high score. It is built by training a deep learning model to label properties as either luxury or fixer-upper, because home location and listing price tend to affect photo quality as well.
- The appropriateness model differentiates relevant viewpoints, such as views of a bedroom, from irrelevant ones, such as walls or humans.
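The abstract does not say how the three model outputs are combined into a single saliency score. As a minimal sketch, assume each model returns a score in [0, 1] and that the scores are merged with a weighted geometric mean, so a very low score from any one model pulls the total down. The names `ViewpointScores` and `saliency_score` and the default weights are hypothetical and are not Trulia's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewpointScores:
    """Hypothetical per-viewpoint outputs of the three models, each in [0, 1]."""
    scene: float            # how representative the view is (kitchen vs. blank wall)
    attractiveness: float   # visual quality (low for blurry or dark views)
    appropriateness: float  # relevance (low for walls or people)


def saliency_score(scores: ViewpointScores,
                   weights: tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Combine the three scores with a weighted geometric mean.

    Assumption: the combination rule is illustrative only; the source does
    not describe how the scores are actually merged.
    """
    values = (scores.scene, scores.attractiveness, scores.appropriateness)
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    total_weight = sum(weights)
    combined = 1.0
    for value, weight in zip(values, weights):
        combined *= value ** (weight / total_weight)
    return combined


kitchen = ViewpointScores(scene=0.9, attractiveness=0.8, appropriateness=0.95)
blank_wall = ViewpointScores(scene=0.1, attractiveness=0.6, appropriateness=0.2)
print(saliency_score(kitchen) > saliency_score(blank_wall))
```

A geometric mean is used here instead of a plain average so that a view that fails one criterion outright (for example, a sharp, well-lit photo of a blank wall) cannot score highly on the strength of the other two.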
Prerequisite knowledge
- A basic understanding of deep learning algorithms and models
What you'll learn
- Learn to create a saliency model that defines criteria for salient thumbnails, ensuring they are representative, attractive, and diverse
- Discover how to extract salient thumbnails while preserving properties such as field of view, 3-D orientation, aspect ratio, and viewport size
- Create an algorithm that ranks all candidate thumbnails extracted from a panorama against defined saliency criteria, using scene, attractiveness, and appropriateness scores
- Deploy these thumbnails within your organization’s practices
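The extraction and ranking steps can be sketched as follows, under stated assumptions: the panorama is stored in equirectangular form, each candidate thumbnail is a perspective (gnomonic) viewport defined by yaw, pitch, field of view, and pixel size, and candidates are ranked by a scoring function supplied by the caller. All names here (`Viewport`, `viewport_to_panorama`, `candidate_viewports`, `rank_viewports`) are hypothetical, and the scoring function stands in for the saliency models.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Viewport:
    yaw_deg: float    # horizontal look direction, 0 = panorama centre
    pitch_deg: float  # vertical look direction, positive = up
    fov_deg: float    # horizontal field of view
    width: int        # thumbnail width in pixels
    height: int       # thumbnail height in pixels


def viewport_to_panorama(view: Viewport, i: float, j: float,
                         pano_w: int, pano_h: int) -> tuple[float, float]:
    """Map thumbnail pixel (i, j) to (u, v) in an equirectangular panorama."""
    focal = (view.width / 2) / math.tan(math.radians(view.fov_deg) / 2)
    # Ray through the pixel in camera space: x right, y up, z forward.
    x = i - (view.width - 1) / 2
    y = (view.height - 1) / 2 - j
    z = focal
    pitch, yaw = math.radians(view.pitch_deg), math.radians(view.yaw_deg)
    # Rotate by pitch about the x-axis, then by yaw about the y-axis.
    y, z = (y * math.cos(pitch) + z * math.sin(pitch),
            -y * math.sin(pitch) + z * math.cos(pitch))
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    lon = math.atan2(x, z)                 # longitude in [-pi, pi]
    lat = math.atan2(y, math.hypot(x, z))  # latitude in [-pi/2, pi/2]
    u = (lon / (2 * math.pi) + 0.5) * pano_w
    v = (0.5 - lat / math.pi) * pano_h
    return u % pano_w, min(max(v, 0.0), pano_h - 1)


def candidate_viewports(yaw_step_deg: float = 30.0, pitch_deg: float = 0.0,
                        fov_deg: float = 90.0, width: int = 320,
                        height: int = 240) -> list[Viewport]:
    """Enumerate viewports at evenly spaced yaw angles around the panorama."""
    count = int(round(360 / yaw_step_deg))
    return [Viewport(-180 + k * yaw_step_deg, pitch_deg, fov_deg, width, height)
            for k in range(count)]


def rank_viewports(views: list[Viewport],
                   score: Callable[[Viewport], float],
                   top_k: int = 1) -> list[Viewport]:
    """Return the top_k viewports, highest score first."""
    return sorted(views, key=score, reverse=True)[:top_k]


# Usage with a placeholder score that simply prefers views near yaw = 90.
views = candidate_viewports(yaw_step_deg=90.0)
best = rank_viewports(views, score=lambda v: -abs(v.yaw_deg - 90.0))
print(best[0])
```

In practice the placeholder lambda would be replaced by a function that renders each viewport (sampling the panorama at the coordinates returned by `viewport_to_panorama`) and scores the resulting image with the saliency models. Fixing the field of view, aspect ratio, and viewport size in `candidate_viewports` keeps every candidate directly comparable.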
Shourabh Rawat is a manager of data science in the data engineering organization at Trulia (Zillow Group). He has more than five years of industry experience working in AI, deep learning, computer vision, and personalization, and in deploying these systems to production at scale. Shourabh and his team focus on developing data science solutions to better understand Trulia’s customers, specifically how they engage with content and property recommendations. Shourabh completed his master’s degree at Carnegie Mellon University, where he researched event detection in consumer videos by applying deep learning to multimodal (audio and image) data.