Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Improving user-merchant propensity modeling using neural collaborative filtering and wide and deep models on Spark BigDL at scale

Sergey Ermolin (Intel), Suqiang Song (Mastercard)
5:10pm–5:50pm Wednesday, March 7, 2018

Who is this presentation for?

  • Data systems architects, project managers, and solutions architects

Prerequisite knowledge

  • A basic understanding of neural networks and machine learning

What you'll learn

  • Learn how to use neural collaborative filtering and wide and deep models on BigDL to predict a user’s probability of shopping at a particular offer merchant during a campaign period


Constructing marketing campaigns, targeting them to specific retail customers, and evaluating campaign effectiveness are perennial problems for merchants and data processors. One of the key parameters of such campaigns is a user’s propensity to shop at a specific merchant in the future. A traditional machine learning method for solving this problem can be broken down into four steps:

  1. Determine if a user shopped at the offer merchant during the training period (a binary label)
  2. Generate a set of features based on the user’s shopping behavior before the training period
  3. Create a model (e.g., regression) to fit the label to each user-merchant (offer) pair using the set of features
  4. Use the regression model to predict future shopper-merchant behavior
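
The four steps above can be sketched on toy data as follows. This is only an illustrative sketch, not the speakers' pipeline: the transactions, the two handcrafted features (visit count and share of visits), and the helper names `label`, `features`, `fit`, and `propensity` are all hypothetical, and a small hand-written logistic regression stands in for "a model (e.g., regression)".

```python
import math

# Hypothetical toy transactions: (user, merchant, period), where period is
# "history" (before the training period) or "train" (the training period).
transactions = [
    ("u1", "m1", "history"), ("u1", "m1", "history"), ("u1", "m1", "train"),
    ("u2", "m1", "history"), ("u2", "m2", "history"),
    ("u3", "m2", "history"), ("u3", "m2", "history"), ("u3", "m2", "train"),
    ("u4", "m2", "history"),
]
users = ["u1", "u2", "u3", "u4"]
merchants = ["m1", "m2"]

def label(user, merchant):
    # Step 1: binary label, did the user shop at the merchant during training?
    return int((user, merchant, "train") in transactions)

def features(user, merchant):
    # Step 2: handcrafted features from behavior before the training period:
    # a bias term, the visit count, and the share of the user's visits.
    visits = sum(1 for t in transactions if t == (user, merchant, "history"))
    total = sum(1 for u, _, p in transactions if u == user and p == "history")
    return [1.0, float(visits), visits / total if total else 0.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(rows, labels, lr=0.5, epochs=2000):
    # Step 3: logistic regression fitted by batch gradient descent.
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def propensity(w, user, merchant):
    # Step 4: predicted probability for a user-merchant pair.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(user, merchant))))

pairs = [(u, m) for u in users for m in merchants]
weights = fit([features(u, m) for u, m in pairs], [label(u, m) for u, m in pairs])
```

Calling `propensity(weights, "u1", "m1")` then returns the fitted score for that pair. Even in this toy version, the choice of features in step 2 is made by hand, which is the workload the next paragraph describes.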

This traditional approach requires extensive feature engineering and user-behavior analysis during a model’s creation and tuning. As such, it often involves creating handcrafted features and demands an intimate knowledge of the dataset.

Sergey Ermolin and Suqiang Song demonstrate how to use the wide and deep and neural collaborative filtering (NCF) models in BigDL on Spark to predict a user’s probability of shopping at a particular offer merchant during a campaign period. Along the way, they compare the deep learning results with those obtained by MLlib’s alternating least squares (ALS) approach. The proposed approaches reduce the feature engineering workload and perform better than traditional ALS as measured by precision and recall metrics. However, these deep neural networks require significantly larger computational resources than traditional approaches, hence the need for a distributed compute infrastructure such as Apache Spark and a scalable deep learning framework such as BigDL.
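
To make the NCF idea concrete, here is a minimal sketch of the forward pass of a two-branch NCF scorer in plain Python. It is not BigDL's API and not the speakers' model: the sizes, the random untrained weights, and the name `ncf_score` are assumptions for illustration. It shows only how learned embeddings of user and merchant IDs replace handcrafted features as the model input.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

NUM_USERS, NUM_MERCHANTS, DIM = 4, 3, 2  # hypothetical toy sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Separate embedding tables for the two branches (an assumption of this sketch).
gmf_user, gmf_merchant = rand_matrix(NUM_USERS, DIM), rand_matrix(NUM_MERCHANTS, DIM)
mlp_user, mlp_merchant = rand_matrix(NUM_USERS, DIM), rand_matrix(NUM_MERCHANTS, DIM)
hidden_w = rand_matrix(DIM, 2 * DIM)  # one hidden layer: 2*DIM inputs -> DIM units
out_w = [random.uniform(-0.5, 0.5) for _ in range(2 * DIM)]  # final scoring layer

def ncf_score(user, merchant):
    # Matrix-factorization-style branch: element-wise product of embeddings.
    gmf = [a * b for a, b in zip(gmf_user[user], gmf_merchant[merchant])]
    # MLP branch: concatenate embeddings, then apply a ReLU hidden layer.
    concat = mlp_user[user] + mlp_merchant[merchant]
    hidden = [max(0.0, sum(w * x for w, x in zip(row, concat))) for row in hidden_w]
    # Fuse both branches and squash the result to a score between 0 and 1.
    logit = sum(w * x for w, x in zip(out_w, gmf + hidden))
    return 1.0 / (1.0 + math.exp(-logit))

# Score every user-merchant pair; with untrained weights the values are arbitrary.
scores = {(u, m): ncf_score(u, m) for u in range(NUM_USERS) for m in range(NUM_MERCHANTS)}
```

In a real setting the embedding tables and layer weights would be trained on observed user-merchant interactions, and at the data volumes described in this session that training is what calls for a distributed framework.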

Sergey and Suqiang share work based on a real-life dataset that covers 12 months of data, between 1 and 10 million distinct qualified consumers, between 2 and 20 billion distinct known transactions, and between 50 and 200 target merchants (offers or campaigns) for benchmarks. Using this dataset as an example, they offer a detailed overview of the merchant-user relationship, share an in-depth outline of the deep learning algorithms they used, and discuss compute resources required.

Sergey Ermolin


Sergey Ermolin is a software solutions architect for deep learning, Spark analytics, and big data technologies at Intel. A Silicon Valley veteran with a passion for machine learning and artificial intelligence, Sergey has been interested in neural networks since 1996, when he used them to predict aging behavior of quartz crystals and cesium atomic clocks made by Hewlett-Packard. Sergey holds an MSEE and a certificate in mining massive datasets from Stanford and BS degrees in both physics and mechanical engineering from California State University, Sacramento.

Suqiang Song


Suqiang Song is director and chapter leader at Mastercard, where he directly oversees a team embedded within the data engineering and AI tribe. Suqiang blends deep business and technical expertise with a passion for coaching people, helping them grow and develop in their area of expertise and ensuring alignment on the “how” of the work they perform in squads.