Predicting Criteo’s internet traffic load using Bayesian structural time series models

Hamlet Jesse Medina Ruiz (Criteo)

4:35pm–5:15pm Wednesday, September 25, 2019

Location: 1A 12/14

Data Science, Machine Learning, & AI

Secondary topics: Media and Advertising, Temporal data and time-series analytics

Level

Intermediate

Criteo connects 1.5 billion active shoppers with the things they need and love. Its technology takes an algorithmic approach to predicting which user to show an ad to, when, and for what products. Criteo’s infrastructure provides the capacity and connectivity to host the Criteo platform and applications, and its evolution is driven by the company’s traffic forecast. Located in six countries across the Americas, Europe, and Asia, its footprint covers nine data centers, two high-performance computing (HPC) clusters, and more than 35K physical servers, handling more than 5M queries per second (QPS) at peak hours.

Because the traffic forecast is so critical, one of the principal tasks of the product data science team is to build machine learning models that forecast traffic demand across services and data centers, so the company can make sound investment decisions when scaling its infrastructure. These models let Criteo predict accurately how many machines any service will need in the future. Capacity predictions are especially useful for allocating hardware for periods when the traffic load is very high, for example, during Black Friday, Cyber Monday, or Christmas sales in the Americas and Europe.

Hamlet Jesse Medina Ruiz explains how to forecast Criteo’s traffic load using Bayesian dynamic time series models. He details the general Bayesian framework, its advantages and limitations, and alternatives to solve the problem.

The company uses Bayesian state space models to forecast daily traffic load several months in advance. In contrast to classical econometric or time series models, the Bayesian framework allows you to infer time-varying components present in the time series, such as local trends and local seasonalities, to capture special events and holidays in a hierarchical way, and to induce sparsity in the model. The Bayesian treatment also allows you to include domain knowledge flexibly in the form of prior distributions. This modeling approach has proven very valuable for Criteo when there isn’t enough data available to train its models. Over the last two years, its models have predicted extreme periods such as Black Friday six months in advance with an error lower than 6%.
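
To make the state space idea concrete, here is a minimal sketch of the simplest structural model, a local level (random walk plus noise), filtered with a Kalman filter and used to forecast with uncertainty. It is not Criteo's model: the function name `local_level_forecast`, the synthetic data, and the fixed variances are all assumptions for illustration. A full Bayesian structural time series model would also place priors on the variances and add trend, seasonal, and holiday components.

```python
import numpy as np


def local_level_forecast(y, horizon, obs_var, level_var,
                         prior_mean=0.0, prior_var=1e6):
    """Filter y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t, then forecast.

    obs_var and level_var are the (assumed known) variances of eps_t and
    eta_t. prior_mean and prior_var describe the Gaussian prior on the
    initial level; a large prior_var means a vague prior.
    """
    mean, var = prior_mean, prior_var
    for obs in y:
        # Predict: the level follows a random walk, so uncertainty grows.
        var = var + level_var
        # Update: blend the prediction with the new observation.
        gain = var / (var + obs_var)
        mean = mean + gain * (obs - mean)
        var = (1.0 - gain) * var

    steps = np.arange(1, horizon + 1)
    # The forecast mean stays flat at the last filtered level.
    forecast_mean = np.full(horizon, mean)
    # Forecast variance: level uncertainty + accumulated drift + noise.
    forecast_var = var + steps * level_var + obs_var
    return forecast_mean, forecast_var


# Synthetic daily "traffic" series (hypothetical data, not Criteo's).
rng = np.random.default_rng(0)
true_level = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=200))
y = true_level + rng.normal(0.0, 3.0, size=200)

mean, var = local_level_forecast(y, horizon=30, obs_var=9.0, level_var=1.0)
lower = mean - 1.96 * np.sqrt(var)  # approximate 95% predictive interval
upper = mean + 1.96 * np.sqrt(var)
```

The predictive interval widens with the horizon because each extra day adds one more random-walk step of variance. Domain knowledge enters through `prior_mean` and `prior_var`: a tight prior around a known traffic level helps when the history is short.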

Prerequisite knowledge

A basic understanding of machine learning and time series concepts

What you'll learn

Learn how to analyze time series using Bayesian modeling, in particular how to make a good forecast by including uncertainty in your estimates

Hamlet Jesse Medina Ruiz

Criteo

Hamlet Jesse Medina Ruiz is a senior data scientist at Criteo. Previously, he was a control system engineer for Petróleos de Venezuela. Hamlet has placed highly in multiple data science competitions, including 4th place in a challenge on predicting return volatility on the New York Stock Exchange hosted by Collège de France and CFM in 2018 and 25th place in a challenge on predicting stock returns hosted by G-Research in 2018. Hamlet holds two master's degrees in mathematics and machine learning from Pierre and Marie Curie University, and a PhD in applied mathematics from Paris-Sud University in France, where he focused on statistical signal processing and machine learning.