Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Applications of mixed effects random forests

Sourav Dey (Manifold)
11:00am-11:40am Thursday, March 28, 2019
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data scientists



Prerequisite knowledge

  • A basic understanding of data science and machine learning vocabulary and concepts

What you'll learn

  • Explore use cases of mixed effects random forests
  • Understand why a mixed effects random forest (MERF) is more effective for clustered data than a vanilla random forest model
  • Discover the history of mixed effects modeling
  • Learn how to use Manifold's open source Python package


Clustered data is all around us. The most common example is longitudinal clustering, where each individual instance of a phenomenon you wish to model has multiple associated measurements (e.g., modeling math test scores as a function of sleep factors when you have multiple measurements per student). Another common example is clustering due to a categorical variable (e.g., clusters representing the specific math teacher of a group of students). Clustering can also be hierarchical (e.g., a student cluster contained within a teacher cluster, which is itself contained within a school cluster). When modeling clustered data, you must account for cluster-specific idiosyncrasies and any non-negligible random effects.
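To make the longitudinal example concrete, here is a minimal sketch using only the Python standard library. It simulates several test scores per student, fits a single pooled line of score against sleep, and measures how much of the leftover variance is tied to the student. All numbers (50 students, 8 tests each, the coefficients and noise levels) and the names `simulate_scores` and `fit_line` are illustrative assumptions, not data or code from the talk.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate_scores(n_students=50, n_tests=8, seed=0):
    """Return rows of (student_id, hours_of_sleep, score). Synthetic data only."""
    rng = random.Random(seed)
    rows = []
    for student in range(n_students):
        # Random effect: one offset shared by every measurement of this student.
        student_offset = rng.gauss(0.0, 10.0)
        for _ in range(n_tests):
            sleep = rng.uniform(4.0, 10.0)
            noise = rng.gauss(0.0, 3.0)
            rows.append((student, sleep, 40.0 + 4.0 * sleep + student_offset + noise))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


rows = simulate_scores()
intercept, slope = fit_line([r[1] for r in rows], [r[2] for r in rows])

# Group the pooled model's residuals by student (the cluster).
residuals_by_student = defaultdict(list)
for student, sleep, score in rows:
    residuals_by_student[student].append(score - (intercept + slope * sleep))

# Total residual variance versus the variance left inside each cluster.
all_residuals = [r for rs in residuals_by_student.values() for r in rs]
total_var = mean(r ** 2 for r in all_residuals)
within = []
for rs in residuals_by_student.values():
    cluster_mean = mean(rs)
    within.extend((r - cluster_mean) ** 2 for r in rs)
within_var = mean(within)

between_share = 1.0 - within_var / total_var
print(f"share of residual variance tied to the student: {between_share:.2f}")
```

When that share is large, a model that ignores the cluster leaves most of its error in a per-student offset it could have estimated, which is the situation the abstract describes.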

The best way to attack this kind of data? Mixed effects models. Inspired by the models it has been building for clients, Manifold has developed an open source Python implementation of mixed effects random forests (MERF).

Sourav Dey explains how the MERF model marries the world of classical mixed effects modeling with modern machine learning algorithms and shows how it can be extended to work with other advanced modeling techniques like gradient boosting machines and deep learning. He also walks you through example use cases and demonstrates MERF performance on synthetic and real data.
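The idea of pairing a learner with per-cluster random effects can be sketched in a few lines. The example below is a simplified illustration, not the MERF package or its fitting procedure: it assumes a random intercept only, writes the model as y = f(x) + b_cluster + noise, and alternates between fitting f and re-estimating a shrunken offset per cluster with crude moment-style variance updates. A least squares line stands in for the random forest so the block needs only the standard library; the `fit_fixed` argument is a hypothetical hook showing where another learner could be swapped in.

```python
import random
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Stand-in fixed effects learner: least squares line, returned as a predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_mixed(xs, ys, clusters, fit_fixed=fit_line, n_iterations=10):
    """Alternate between fitting f and estimating one random intercept per cluster."""
    offsets = defaultdict(float)  # b_cluster, initialised to zero
    var_b, var_e = 1.0, 1.0       # starting guesses for the two variances
    for _ in range(n_iterations):
        # Step 1: fit the fixed effects learner on targets with offsets removed.
        predict = fit_fixed(xs, [y - offsets[c] for y, c in zip(ys, clusters)])
        # Step 2: re-estimate each offset as a shrunken mean residual.
        residuals = defaultdict(list)
        for x, y, c in zip(xs, ys, clusters):
            residuals[c].append(y - predict(x))
        for c, rs in residuals.items():
            shrink = len(rs) * var_b / (len(rs) * var_b + var_e)
            offsets[c] = shrink * mean(rs)
        # Step 3: moment-style variance updates (a simplification, floored to avoid 0/0).
        var_b = mean(b ** 2 for b in offsets.values())
        var_e = max(
            mean((r - offsets[c]) ** 2 for c, rs in residuals.items() for r in rs),
            1e-12,
        )
    return predict, dict(offsets)


# Synthetic clustered data: 30 clusters with 10 observations each.
rng = random.Random(1)
xs, ys, clusters = [], [], []
for c in range(30):
    cluster_offset = rng.gauss(0.0, 10.0)
    for _ in range(10):
        x = rng.uniform(4.0, 10.0)
        xs.append(x)
        ys.append(40.0 + 4.0 * x + cluster_offset + rng.gauss(0.0, 3.0))
        clusters.append(c)

pooled = fit_line(xs, ys)
predict, offsets = fit_mixed(xs, ys, clusters)

pooled_mse = mean((y - pooled(x)) ** 2 for x, y in zip(xs, ys))
mixed_mse = mean(
    (y - predict(x) - offsets.get(c, 0.0)) ** 2 for x, y, c in zip(xs, ys, clusters)
)
print(f"pooled MSE: {pooled_mse:.1f}, mixed MSE: {mixed_mse:.1f}")
```

In this sketch a cluster that was never seen during fitting gets an offset of zero through `offsets.get(c, 0.0)`, so its prediction falls back to the fixed effects learner alone. Because `fit_fixed` only has to return a callable predictor, the same loop would accept a tree ensemble or a neural network in place of the line.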


Sourav Dey


Sourav Dey is CTO at Manifold, an artificial intelligence engineering services firm with offices in Boston and Silicon Valley. Sourav leads the engineering team, focusing on client projects, developing platform technologies that make Manifold's ML engineers more efficient, and communicating with business stakeholders. Prior to Manifold, Sourav led teams building data products across the technology stack, from smart thermostats and security cams at Google-Nest to wireless communication at Qualcomm. Sourav's career has always been at the intersection of math and computer science: a PhD in signal processing from MIT and bachelor's degrees in math and CS from MIT.



Sourav Dey | CTO
03/29/2019 6:25am PDT

Slides can be downloaded here:

03/29/2019 6:08am PDT

Thanks for the great talk. Can you please share your slides?