Merging versus Ensembling in Multi-Study Machine Learning: Theoretical Insight from Random Effects

05/17/2019
by Zoe Guan, et al.

A critical decision point when training predictors using multiple studies is whether these studies should be combined or treated separately. We compare two multi-study learning approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets. We consider 1) merging all of the datasets and training a single learner, and 2) cross-study learning, which involves training a separate learner on each dataset and combining the resulting predictions. In a linear regression setting, we show analytically and confirm via simulation that merging yields lower prediction error than cross-study learning when the predictor-outcome relationships are relatively homogeneous across studies. However, as heterogeneity increases, there exists a transition point beyond which cross-study learning outperforms merging. We provide analytic expressions for the transition point in various scenarios and study asymptotic properties.
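The comparison described above can be explored with a small simulation. The following is a minimal sketch, not the authors' code: it assumes Gaussian predictors, no intercept, unit noise variance, study-specific coefficients drawn as `beta + tau * noise` (a simple random-effects model), ordinary least squares as the learner, and equal weights when averaging the per-study predictions. All function names and parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_studies(rng, n_studies, n_per_study, beta, tau):
    """Draw studies whose coefficients are beta plus N(0, tau^2) random effects."""
    studies = []
    for _ in range(n_studies):
        beta_k = beta + tau * rng.standard_normal(beta.shape)
        X = rng.standard_normal((n_per_study, beta.size))
        y = X @ beta_k + rng.standard_normal(n_per_study)  # unit-variance noise
        studies.append((X, y))
    return studies


def fit_ols(X, y):
    """Least-squares coefficients (no intercept, an assumption of this sketch)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def merged_predict(train, X_new):
    """Merging: stack all training studies and fit a single learner."""
    X = np.vstack([X for X, _ in train])
    y = np.concatenate([y for _, y in train])
    return X_new @ fit_ols(X, y)


def ensemble_predict(train, X_new):
    """Cross-study learning: fit one learner per study, average the predictions."""
    preds = [X_new @ fit_ols(X, y) for X, y in train]
    return np.mean(preds, axis=0)


def compare(tau, n_reps=200, seed=0):
    """Average test MSE of both approaches on a held-out study."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5, 0.25])  # illustrative true mean coefficients
    mse_merge, mse_ens = [], []
    for _ in range(n_reps):
        studies = simulate_studies(rng, 5, 10, beta, tau)
        train, (X_test, y_test) = studies[:-1], studies[-1]
        mse_merge.append(np.mean((y_test - merged_predict(train, X_test)) ** 2))
        mse_ens.append(np.mean((y_test - ensemble_predict(train, X_test)) ** 2))
    return float(np.mean(mse_merge)), float(np.mean(mse_ens))


if __name__ == "__main__":
    for tau in (0.0, 0.5, 1.0, 2.0):
        merged, ensembled = compare(tau)
        print(f"tau={tau:.1f}  merged MSE={merged:.3f}  ensemble MSE={ensembled:.3f}")
```

Here `tau` controls the heterogeneity in predictor-outcome relationships across studies. Sweeping it over a grid and comparing the two error columns is one way to look for the transition point empirically. Where it falls in this toy setup depends on the assumed sample sizes, number of studies, and weighting scheme, so it need not match the analytic expressions in the paper.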


