Presented By O’Reilly and Cloudera

San Francisco • London • New York

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

When Tiramisu meets online fashion retail

Patty Ryan (Microsoft), CY Yam (Microsoft), Elena Terenzi (Microsoft)

4:35pm–5:15pm Wednesday, 09/12/2018

Data science and machine learning
Location: 1A 15/16 Level: Intermediate

Secondary topics: Deep Learning, Media, Marketing, Advertising, Retail and e-commerce

Who is this presentation for?

Data scientists

Prerequisite knowledge

A general understanding of applied machine learning

What you'll learn

Explore Tiramisu, a deep learning-based tool for image segmentation and background removal
Learn how to apply both traditional ML and deep learning to mitigate issues, especially those related to human error, associated with maintaining a very large catalogue for online retail

Description

Large online fashion retailers must efficiently maintain catalogues of millions of items. Due to human error, it’s not unusual that some items have duplicate entries. Since manually trawling such a large catalogue is next to impossible, how can you find these entries?

You might take a snapshot of a newly arrived item with your phone and have an algorithm automatically check, based on visual appearance, whether the item is already registered. However, content-based image retrieval is likely to be hindered by differences in visual content between the images, such as the busy background of a mobile photo versus a clean studio image, not to mention inconsistent folding or creases, lighting, scale, and viewing angle. To increase the success rate, it’s prudent to remove the background of the query image before applying any retrieval algorithm.
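The effect of background removal on retrieval can be illustrated with a toy sketch. This is not the speakers' pipeline: the color-histogram descriptor, the cosine similarity measure, and every name below are assumptions chosen for brevity, and the "images" are tiny hand-made grids of quantized color indices.

```python
import math

def histogram(pixels, mask=None, bins=4):
    """Normalized color histogram; pixel values are color-bin indices in [0, bins).

    If a mask is given, only pixels where the mask is truthy are counted.
    """
    counts = [0] * bins
    for r, row in enumerate(pixels):
        for c, value in enumerate(row):
            if mask is None or mask[r][c]:
                counts[value] += 1
    total = sum(counts) or 1
    return [n / total for n in counts]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical studio image: garment (color 1) on a clean background (color 0).
studio = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
# Hypothetical phone snapshot: same garment on a busy background (colors 2 and 3).
phone = [[2, 3, 2, 3],
         [3, 1, 1, 2],
         [2, 1, 1, 3],
         [3, 2, 3, 2]]
# Foreground masks, assumed to come from some segmentation step.
garment_mask = [[0, 0, 0, 0],
                [0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 0, 0, 0]]

raw_score = cosine(histogram(studio), histogram(phone))
masked_score = cosine(histogram(studio, garment_mask),
                      histogram(phone, garment_mask))
print(f"similarity without masking: {raw_score:.3f}")
print(f"similarity with masking:    {masked_score:.3f}")
```

With the backgrounds included, the two descriptors are dominated by background colors and barely overlap; once both are restricted to the garment pixels, they coincide. Real systems would use far richer descriptors, but the same reasoning motivates segmenting the query image first.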

Patty Ryan, CY Yam, and Elena Terenzi explain how they developed a specialized segmentation model for background removal, or garment (foreground) segmentation, using Tiramisu, one of the most recent deep learning architectures. The solution achieved a segmentation accuracy of 94% with 200 training images and has been shown to significantly improve content-based image retrieval performance.
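The abstract does not say how the 94% segmentation accuracy was measured. One common reading is pixel accuracy, the fraction of pixels whose predicted label matches the ground-truth mask; foreground intersection-over-union (IoU) is a frequent companion metric. The sketch below computes both on small hand-made masks. The function names and the example masks are illustrative assumptions, not the speakers' evaluation code or data.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels where the predicted label equals the true label."""
    total = correct = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            total += 1
            correct += int(p == t)
    return correct / total if total else 0.0

def foreground_iou(pred, truth):
    """Intersection-over-union of the foreground (label 1) regions."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Two empty foregrounds agree perfectly by convention.
    return inter / union if union else 1.0

# Hypothetical ground-truth garment mask and a slightly wrong prediction.
truth = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 1, 1],   # one background pixel wrongly marked as garment
        [0, 1, 0, 0],   # one garment pixel missed
        [0, 0, 0, 0]]

print(pixel_accuracy(pred, truth))
print(foreground_iou(pred, truth))
```

Pixel accuracy can look high when the background dominates the image, which is why IoU on the garment region is often reported alongside it.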

Patty, CY, and Elena begin by discussing GrabCut, a very successful foreground segmentation method, and explain how it is used to create labeled data. They then offer an overview of their deep learning-based specialized segmentation tool Tiramisu and show where the model performs well and where its performance is less satisfactory. Patty, CY, and Elena conclude with a demonstration of how this tool can help prevent duplicate entries in a very large online fashion retailer’s catalogue.
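As a rough illustration of how GrabCut might produce training labels, the sketch below runs OpenCV's `cv2.grabCut` from a bounding rectangle and collapses its four-class output into a binary garment mask. The abstract does not describe the speakers' actual labeling procedure, so the rectangle initialization, the iteration count, and the helper names `to_binary_label` and `grabcut_label` are all assumptions. The second function needs the `opencv-python` and `numpy` packages, which are imported lazily.

```python
def to_binary_label(grabcut_mask):
    """Collapse GrabCut's four classes into 0 (background) or 1 (garment).

    OpenCV encodes the mask as 0 = definite background, 1 = definite
    foreground, 2 = probable background, 3 = probable foreground.
    """
    return [[1 if v in (1, 3) else 0 for v in row] for row in grabcut_mask]

def grabcut_label(image_path, rect, iterations=5):
    """Run GrabCut seeded by rect = (x, y, width, height); return a 0/1 mask.

    Hypothetical helper: the rectangle would have to be supplied by a person
    or a detector, and the resulting masks reviewed before training.
    """
    import cv2
    import numpy as np

    image = cv2.imread(image_path)  # 8-bit, 3-channel BGR image
    if image is None:
        raise FileNotFoundError(image_path)
    mask = np.zeros(image.shape[:2], np.uint8)
    # Internal model buffers required by the OpenCV API.
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, rect, bgd_model, fgd_model,
                iterations, cv2.GC_INIT_WITH_RECT)
    return to_binary_label(mask.tolist())

# The class-collapsing step on a tiny hand-written GrabCut-style mask.
example = to_binary_label([[0, 2, 2],
                           [2, 3, 1],
                           [0, 3, 2]])
print(example)
```

A call such as `grabcut_label("item.jpg", (10, 10, 200, 300))` (file name and rectangle are placeholders) would return a mask usable as a segmentation target, subject to manual checking.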

Patty Ryan

Microsoft

Patty Ryan is an applied data scientist at Microsoft, where she codes with the company’s partners and customers to tackle tough problems using machine learning approaches with sensor, text, and vision data. She’s a graduate of the University of Michigan.

CY Yam

Microsoft

CY Yam is a data scientist at Microsoft, where she applies machine learning techniques to solve various problems in daily life. Previously, CY invented new ways to recognize people by the way they move.

Elena Terenzi

Microsoft

Elena Terenzi is a software development engineer at Microsoft, where she brings business intelligence solutions to Microsoft Enterprise customers and advocates for business analytics and big data solutions for the manufacturing sector in Western Europe, such as helping big automotive customers implement telemetry analytics solutions with an IoT flavor in their enterprises. She started her career with data as a database administrator and data analyst for an investment bank in Italy. Elena holds a master’s degree in AI and NLP from the University of Illinois at Chicago.
