Cross-study learning for generalist and specialist predictions

07/24/2020
by   Boyu Ren, et al.
0

Jointly using data from multiple similar sources for the training of prediction models is increasingly becoming an important task in many fields of science. In this paper, we propose a framework for generalist and specialist predictions that leverages multiple datasets, with potential heterogenity in the relationships between predictors and outcomes. Our framework uses ensembling with stacking, and includes three major components: 1) training of the ensemble members using one or more datasets, 2) a no-data-reuse technique for stacking weights estimation and 3) task-specific utility functions. We prove that under certain regularity conditions, our framework produces a stacked prediction function with oracle property. We also provide analytically the conditions under which the proposed no-data-reuse technique will increase the prediction accuracy of the stacked prediction function compared to using the full data. We perform a simulation study to numerically verify and illustrate these results and apply our framework to predicting mortality based on a collection of variables including long-term exposure to common air pollutants.

READ FULL TEXT
research
09/19/2021

Optimal Ensemble Construction for Multi-Study Prediction with Applications to COVID-19 Excess Mortality Estimation

It is increasingly common to encounter prediction tasks in the biomedica...
research
06/11/2018

Aggregating Predictions on Multiple Non-disclosed Datasets using Conformal Prediction

Conformal Prediction is a machine learning methodology that produces val...
research
10/16/2016

Dynamic Stacked Generalization for Node Classification on Networks

We propose a novel stacked generalization (stacking) method as a dynamic...
research
05/01/2023

Cross-Institutional Transfer Learning for Educational Models: Implications for Model Performance, Fairness, and Equity

Modern machine learning increasingly supports paradigms that are multi-i...
research
10/17/2021

TIP: Task-Informed Motion Prediction for Intelligent Systems

Motion prediction is important for intelligent driving systems, providin...
research
01/29/2019

Simultaneous prediction of multiple outcomes using revised stacking algorithms

Motivation: HIV is difficult to treat because its virus mutates at a hig...

Please sign up or login with your details

Forgot password? Click here to reset