Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Building a recommender from a big behavior graph over Cassandra

Gleicon Moraes (luc.id), Arthur Grava (Luizalabs)
2:40pm–3:20pm Thursday, March 16, 2017
Secondary topics:  Architecture, Data Platform, ecommerce
Average rating: 4.00 (3 ratings)

Who is this presentation for?

  • Developers, data scientists, team leaders, and managers

Prerequisite knowledge

  • A basic knowledge of big data tools

What you'll learn

  • Learn how one company built high-performance APIs, extracted knowledge and value from user information, processed large-scale data with Hadoop, and stored it in a way that provides fast query times

Description

Recommender systems personalize the user experience, helping users find the product they need or desire. There are many options on the market that integrate user behavior and purchase history, connecting this information with suggestions, emails, and push notifications.

Developing your own recommender system lets you test quickly and flexibly, adapt results to your business rules, and optimize more closely for your ecommerce system. But how do you build a robust architecture that is linearly scalable and capable of serving millions of recommendations daily while also keeping track of its performance?

Gleicon Moraes and Arthur Grava share war stories about developing and deploying a cloud-based large-scale recommender system for a top-three Brazilian ecommerce company. The system, which uses Cassandra, TitanDB, Gremlin, Hadoop, Java, and Python to calculate recommendations with a low computational cost without performing matrix operations, led to a more than 15% increase in sales. Gleicon and Arthur cover the evolution of the architecture and how they implemented their recommender algorithms using a graph database and discuss the system’s incremental growth and the decisions to simplify the architecture and the recommendations processing.
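To make the idea of graph-based recommendations without matrix operations concrete, here is a minimal Python sketch. It is not the speakers' implementation: it uses in-memory dictionaries as a stand-in for a graph database, and the names `build_graph`, `recommend`, and the sample events are hypothetical. It walks two hops through a user-product behavior graph (product, then users who interacted with it, then their other products) and ranks neighbors by how often they co-occur.

```python
from collections import Counter, defaultdict


def build_graph(events):
    """Build a bipartite behavior graph from (user, product) pairs.

    Assumption: each event is one interaction (view, purchase, etc.);
    a real system would keep these edges in a graph store instead.
    """
    user_to_products = defaultdict(set)
    product_to_users = defaultdict(set)
    for user, product in events:
        user_to_products[user].add(product)
        product_to_users[product].add(user)
    return user_to_products, product_to_users


def recommend(product, user_to_products, product_to_users, limit=3):
    """Two-hop walk: product -> its users -> their other products."""
    counts = Counter()
    # .get avoids creating an entry for a product that was never seen
    for user in product_to_users.get(product, ()):
        for other in user_to_products[user]:
            if other != product:
                counts[other] += 1
    # Highest co-occurrence first; ties broken by name for stable output
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]


# Hypothetical sample data for illustration only
events = [
    ("ana", "tv"), ("ana", "soundbar"),
    ("bruno", "tv"), ("bruno", "soundbar"), ("bruno", "hdmi-cable"),
    ("carla", "tv"), ("carla", "wall-mount"),
]
u2p, p2u = build_graph(events)
top = recommend("tv", u2p, p2u)
print(top)
```

The cost of each lookup depends only on the neighborhood of the requested product, not on the size of the full user-by-product matrix, which is one plausible reading of the "low computational cost" claim above.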

Gleicon Moraes

luc.id

Gleicon Moraes is director of data engineering at luc.id. Gleicon loves infrastructure for data, moving large volumes through distributed messaging systems, and databases. He uses Python, Go, and Erlang and focuses on distributed systems, nonrelational databases, and OSS.

Arthur Grava

Luizalabs

Arthur Grava is the big data team leader at Luizalabs, where he works closely with the company’s recommender system and focuses on machine learning with Hadoop, Java, Cassandra, and Python. Arthur holds a master’s degree in recommender systems from USP.