Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

Building South East Asia's largest E-commerce Recommender

Kai Xin Thia (Lazada)
11:00am–11:40am Wednesday, 12/02/2015
Data Science and Advanced Analytics
Location: 321-322 Level: Intermediate
Tags: featured
Average rating: 4.45 (11 ratings)

Prerequisite Knowledge

  • Have interacted with some form of recommender product before (Google, Facebook, Twitter, Amazon, Netflix, Pandora, etc.).
  • High school level mathematics.
  • An interest in learning more about large-scale recommenders.


Recommenders are ubiquitous in the e-commerce space. You see them on websites under “People who have bought this also bought…”. You receive daily emails with titles like “Handpicked new products for you!” The unique thing about recommenders is that they do not have a clear “correct answer”. For example, recommending me another smartphone after I have just bought one is “correct” based on my purchase behavior but really a poor choice in terms of context. On the other hand, keeping me updated on the power bank I have in my wish-list and recommending me similar but cheaper alternatives might show several “wrong” items that I will never buy, but it might also help me discover new, better products than what I had in mind.

For this talk, I will share what my team at Lazada has learnt from building the largest e-commerce recommender in South East Asia:

  • How do we measure the performance of recommenders? Beyond the metrics that measure historical performance, how do we pick recommenders that improve user discovery and serendipity?
  • Why and how do we mix inputs from several recommenders? In Lazada’s case especially, it is very complex and expensive to create a single recommender that works across multiple countries and cultural preferences. Furthermore, each type of recommender comes with its own strengths and weaknesses; some might do well on new customers (the cold start problem) while others perform better on customers with a good shopping history. We will explore the various types of models we built, the tradeoffs we considered, and how we optimized the model mix.
  • How do we scale recommenders? Apache Spark is powerful, but without a strong underlying Hadoop infrastructure and tools like Kafka to handle clickstream data, it would be impossible to perform analytics on millions of products and users.
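
To make the idea of mixing recommenders concrete, here is a minimal sketch in Python. It is not Lazada's method: the two recommenders (a popularity ranker and a collaborative filter), the function names `normalize` and `blend`, and the weighting rule based on purchase count are all assumptions chosen for illustration. The sketch leans on the popularity scores for new customers (the cold start case) and shifts weight toward the collaborative scores as a customer's shopping history grows.

```python
from typing import Dict, List

Scores = Dict[str, float]  # item id -> raw score from one recommender


def normalize(scores: Scores) -> Scores:
    """Min-max scale scores to [0, 1] so different recommenders are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def blend(popularity: Scores, collaborative: Scores, n_purchases: int,
          k: int = 3, half_weight_at: int = 5) -> List[str]:
    """Return the top-k items from a weighted mix of two recommenders.

    Hypothetical weighting rule: the collaborative weight is 0 for a
    brand-new user, 0.5 at `half_weight_at` purchases, and approaches 1
    as the purchase count grows.
    """
    w = n_purchases / (n_purchases + half_weight_at)
    pop, cf = normalize(popularity), normalize(collaborative)
    items = set(pop) | set(cf)
    # An item missing from one recommender contributes 0 from that side.
    mixed = {i: (1 - w) * pop.get(i, 0.0) + w * cf.get(i, 0.0) for i in items}
    # Sort by blended score (descending), then by item id for stable ties.
    return sorted(mixed, key=lambda i: (-mixed[i], i))[:k]


popularity = {"a": 100, "b": 50, "c": 10}        # e.g. units sold
collaborative = {"c": 0.9, "d": 0.8, "a": 0.1}   # e.g. predicted affinity

print(blend(popularity, collaborative, n_purchases=0, k=2))   # new customer
print(blend(popularity, collaborative, n_purchases=45, k=2))  # long history
```

With these made-up scores, the new customer sees the best sellers, while the customer with a long history sees the items the collaborative filter favors. In practice the weights would be tuned against whichever metrics are chosen to measure recommender performance rather than fixed by a hand-written rule.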

Kai Xin Thia


Kai Xin Thia is a data scientist at Lazada. He specializes in behavioral analytics and has an interest in large recommendation systems. He has been building behavioral models for three years and is in the top 1% on Kaggle, an international data science competition portal. Kai Xin is also the co-founder of DataScience SG (the largest data science community in Singapore), a volunteer at DataKind SG (an NGO that helps other NGOs through data science), and an invited speaker/trainer at various data meetups in Singapore. He likes traveling and experiencing the diversity of the world.

Comments on this page are now closed.


Duy Ngan Le
12/01/2015 2:39am +08

I am very interested in your talk. Hope to have time to catch up with you after your presentation.