Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Applications of mixed effects random forests

Sourav Dey (Manifold)
11:00am–11:40am Thursday, March 28, 2019
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data scientists

Level

Intermediate

Prerequisite knowledge

  • A basic understanding of data science and machine learning vocabulary and concepts

What you'll learn

  • Explore use cases of mixed effects random forests
  • Understand why MERF is more effective for clustered data than a vanilla random forest model
  • Discover the history of mixed effects modeling
  • Learn how to use Manifold's open source MERF Python package

Description

Clustered data is all around us. The most common example is longitudinal clustering, where each individual instance of a phenomenon you wish to model has multiple associated measurements (e.g., modeling math test scores as a function of sleep factors when you have multiple measurements per student). Another common example is clustering due to a categorical variable (e.g., clusters representing the specific math teacher of a group of students). Clustering can also be hierarchical (e.g., a student cluster contained within a teacher cluster, which is itself contained within a school cluster). When modeling clustered data, you must account for the idiosyncrasies of each cluster, that is, for any nonnegligible cluster-level random effects.
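As a rough illustration of the longitudinal case above, the following sketch simulates math scores with several measurements per student and compares residual variance between and within students after a pooled fit that ignores the clusters. Everything here is an assumption made for illustration: the function names, the coefficients, and the noise levels are hypothetical and are not taken from the talk.

```python
import random
from statistics import mean, pvariance

def simulate_scores(n_students=30, n_tests=10, seed=0):
    """Return (student_id, sleep_hours, score) rows with a per-student offset."""
    rng = random.Random(seed)
    rows = []
    for student in range(n_students):
        offset = rng.gauss(0, 3)  # random effect shared by all rows of one student
        for _ in range(n_tests):
            sleep = rng.uniform(5, 9)
            score = 60 + 2 * sleep + offset + rng.gauss(0, 1)
            rows.append((student, sleep, score))
    return rows

def pooled_line(rows):
    """Least-squares fit of score on sleep that ignores the clusters."""
    xs = [r[1] for r in rows]
    ys = [r[2] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residual_summary(rows):
    """Variance of per-student mean residuals vs. average variance within a student."""
    slope, intercept = pooled_line(rows)
    by_student = {}
    for student, sleep, score in rows:
        by_student.setdefault(student, []).append(score - (intercept + slope * sleep))
    between = pvariance([mean(res) for res in by_student.values()])
    within = mean(pvariance(res) for res in by_student.values())
    return between, within

rows = simulate_scores()
between, within = residual_summary(rows)
print(f"between-student residual variance: {between:.2f}")
print(f"within-student residual variance:  {within:.2f}")
```

If the pooled model captured everything, the per-student mean residuals would sit close to zero. In this simulated data they do not, because each student's offset is left unexplained, and that leftover structure is what a mixed effects model is meant to absorb.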

The best way to attack this kind of data? Mixed effects models. Inspired by the models it has been building for clients, Manifold has developed mixed effects random forests (MERF), an open source Python package.

Sourav Dey explains how the MERF model marries the world of classical mixed effects modeling with modern machine learning algorithms and shows how it can be extended to work with other advanced modeling techniques like gradient boosting machines and deep learning. He also walks you through example use cases and demonstrates MERF performance on synthetic and real data.
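The idea of pairing a mixed effects model with an arbitrary learner can be sketched as an alternating, EM-style fit. The example below is a simplified illustration, not the MERF package's implementation or API: it handles only a random intercept per cluster, and a least-squares line stands in for the random forest so that the sketch needs nothing beyond the standard library. The names fit_line and fit_random_intercepts, and the synthetic data, are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line; a stand-in for the random forest in this sketch."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_random_intercepts(clusters, xs, ys, n_iter=20, fit_fixed=fit_line):
    """Alternate between fitting f(x) and updating per-cluster intercepts b."""
    ids = sorted(set(clusters))
    b = {c: 0.0 for c in ids}
    var_b, var_e = 1.0, 1.0  # starting guesses for the variance components
    for _ in range(n_iter):
        # Step 1: fit the fixed-effects learner on targets with b removed.
        f = fit_fixed(xs, [y - b[c] for c, y in zip(clusters, ys)])
        # Step 2: shrink each cluster's mean residual toward zero.
        resid = {c: [] for c in ids}
        for c, x, y in zip(clusters, xs, ys):
            resid[c].append(y - f(x))
        post_var = {}
        for c in ids:
            n = len(resid[c])
            b[c] = n * var_b / (n * var_b + var_e) * mean(resid[c])
            post_var[c] = var_b * var_e / (n * var_b + var_e)
        # Step 3: update the variance components from the current estimates.
        var_b = mean(b[c] ** 2 + post_var[c] for c in ids)
        var_e = sum(
            sum((r - b[c]) ** 2 for r in resid[c]) + len(resid[c]) * post_var[c]
            for c in ids
        ) / len(ys)
    return f, b, var_b, var_e

# Synthetic check: 30 clusters of 10 rows, true slope 2, cluster offsets with sd 3.
rng = random.Random(0)
clusters, xs, ys = [], [], []
for c in range(30):
    offset = rng.gauss(0, 3)
    for _ in range(10):
        x = rng.uniform(5, 9)
        clusters.append(c)
        xs.append(x)
        ys.append(60 + 2 * x + offset + rng.gauss(0, 1))

f, b, var_b, var_e = fit_random_intercepts(clusters, xs, ys)
slope = f(1.0) - f(0.0)
print(f"estimated slope: {slope:.2f}")
print(f"cluster variance: {var_b:.2f}, noise variance: {var_e:.2f}")
```

Because fit_fixed is just a callable that returns a predictor, any regressor could in principle be dropped into Step 1, which is the sense in which the approach extends beyond random forests.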

Sourav Dey

Manifold

Sourav Dey is CTO at Manifold, an artificial intelligence engineering services firm with offices in Boston and Silicon Valley. Previously, Sourav led teams building data products across the technology stack, from smart thermostats and security cams at Google/Nest to power grid forecasting at AutoGrid to wireless communication chips at Qualcomm. He holds patents for his work, has been published in several IEEE journals, and has won numerous awards. He holds PhD, MS, and BS degrees in electrical engineering and computer science from MIT.

Comments

Sourav Dey | CTO
03/29/2019 6:25am PDT

Slides can be downloaded here: https://www.manifold.ai/2019strataSF

Faraz Ahmad | SOFTWARE ENGINEER
03/29/2019 6:08am PDT

Hi,
Thanks for the great talk. Can you please share your slides?