Sep 23–26, 2019

Search logs + machine learning = autotagged inventory

John Berryman (Eventbrite)
4:35pm–5:15pm Wednesday, September 25, 2019
Location: 3B - Expo Hall

Who is this presentation for?

  • Data-driven application engineers, search technologists, taxonomists, ecommerce folks

Level

Intermediate

Description

For ecommerce applications, matching users with the items they want is the name of the game. If they can’t find what they want, then how can they buy anything? Typically, this functionality is provided through the search and browse experience. Search allows users to type in text and match against the text of the items in the inventory. Browse allows users to select filters and slice and dice the inventory down to the subset they’re interested in. But with the shift toward mobile devices, no one wants to type anymore—thus browse is becoming dominant in the ecommerce experience.

But there’s a problem if your inventory isn’t categorized. Perhaps your inventory is user generated or generated by external providers who don’t tag and categorize the inventory. No categories and no tags mean no browse experience and missed sales. You could hire an army of taxonomists and curators to tag items, but training and curation will be expensive. You can demand that your providers tag their items and adhere to your taxonomy, but providers will buck this new requirement unless they see an obvious and immediate benefit. Worse, providers might use tags to game the system, artificially placing themselves in the wrong category to drive more sales. Worst of all, creating the right taxonomy is hard. You have to structure a taxonomy to realistically represent how your customers think about the inventory.

Eventbrite is investigating a tantalizing alternative: using a combination of customer interactions and machine learning to automatically tag and categorize its inventory. As customers interact with the platform—as they search for events and click on and purchase events that interest them—Eventbrite implicitly gathers information about how its users think about its inventory. Search text effectively acts like a tag, and a click on an event card is a vote that the clicked event is representative of that tag. Eventbrite uses this stream of information as training data for a machine learning classification model, and as Eventbrite receives new inventory, it can automatically tag it with the text that customers will likely use when searching for it. This makes it possible to better understand the inventory, supply and demand, and most importantly this allows Eventbrite to build the browse experience that customers demand.
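
The loop described above, where search text acts as a tag and each click is a vote, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Eventbrite's implementation: the click log, the event titles, the names `build_training_pairs` and `NaiveBayesTagger`, and the `min_share` threshold are all hypothetical, and a small naive Bayes classifier stands in for whatever model the team actually trained.

```python
import math
from collections import Counter, defaultdict

# Hypothetical click log: (search text, id of the event card that was clicked).
CLICK_LOG = [
    ("jazz", "e1"), ("Jazz ", "e1"), ("jazz", "e2"),
    ("yoga", "e3"), ("yoga", "e3"), ("yoga", "e4"),
    ("jazz", "e3"),  # a stray click that the vote threshold should discard
]

# Hypothetical event titles keyed by event id.
EVENT_TEXT = {
    "e1": "live jazz quartet at the blue room",
    "e2": "saxophone and piano jazz night",
    "e3": "sunrise yoga in the park",
    "e4": "beginner vinyasa yoga flow class",
}


def normalize(query):
    """Lowercase and collapse whitespace so 'Jazz ' and 'jazz' are one tag."""
    return " ".join(query.lower().split())


def build_training_pairs(click_log, event_text, min_share=0.5):
    """Turn clicks into (event text, tag) pairs.

    Each click is one vote that the query describes the event. A query is kept
    as a tag only if it received at least `min_share` of that event's votes.
    """
    votes = defaultdict(Counter)  # event id -> Counter of normalized queries
    for query, event_id in click_log:
        votes[event_id][normalize(query)] += 1

    pairs = []
    for event_id, counter in votes.items():
        total = sum(counter.values())
        for tag, n in counter.items():
            if n / total >= min_share:
                pairs.append((event_text[event_id], tag))
    return pairs


class NaiveBayesTagger:
    """Tiny multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, pairs):
        self.tag_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, tag in pairs:
            self.tag_counts[tag] += 1
            for word in text.lower().split():
                self.word_counts[tag][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        """Return the most likely tag for the text of a new, untagged event."""
        total_docs = sum(self.tag_counts.values())
        best_tag, best_score = None, -math.inf
        for tag, n_docs in self.tag_counts.items():
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[tag].values())
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[tag][word]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag


if __name__ == "__main__":
    tagger = NaiveBayesTagger().fit(build_training_pairs(CLICK_LOG, EVENT_TEXT))
    print(tagger.predict("late night jazz piano session"))  # prints "jazz"
```

The vote threshold is the design choice worth noticing: a single click is weak evidence, so the sketch keeps a query as a tag only when it accounts for a meaningful share of an event's clicks. A production system would likely also weight purchases more heavily than clicks and predict several tags per event rather than one.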

John Berryman takes a deep dive into the problem space and Eventbrite’s approach. He explores how the company gathered training data from its search and click logs, and how it built and refined the model. You’ll see the output of the model, the positive results of Eventbrite’s work, and the work left to be done. You’ll leave with some new ideas to take back to your business.

Prerequisite knowledge

  • A basic understanding of machine learning involving text manipulation, classification algorithms, and neural networks

What you'll learn

  • Learn a technique for generating product tags from the search behavior of customers

John Berryman

Eventbrite

John Berryman started out in the field of aerospace engineering, but soon found that he was more interested in math and software than in satellites and aircraft. He made the leap into software development, specializing in search and recommendation technologies. John’s a senior software engineer at Eventbrite, where he helps build Eventbrite’s event discovery platform. He also recently coauthored a tech book, Relevant Search (Manning). The proceeds from the book have mostly paid for the coffee consumed while writing it.

