Practical considerations for specifying a super learner

04/13/2022
by Rachael V. Phillips, et al.

Common tasks encountered in epidemiology, including disease incidence estimation and causal inference, rely on predictive modeling. Constructing a predictive model can be thought of as learning a prediction function, i.e., a function that takes as input covariate data and outputs a predicted value. Many strategies for learning these functions from data are available, from parametric regressions to machine learning algorithms. It can be challenging to choose an approach, as it is impossible to know in advance which one is the most suitable for a particular dataset and prediction task at hand. The super learner (SL) is an algorithm that alleviates concerns over selecting the one "right" strategy while providing the freedom to consider many of them, such as those recommended by collaborators, used in related research, or specified by subject-matter experts. It is an entirely pre-specified and data-adaptive strategy for predictive modeling. To ensure the SL is well-specified for learning the prediction function, the analyst does need to make a few important choices. In this Education Corner article, we provide step-by-step guidelines for making these choices, walking the reader through each of them and providing intuition along the way. In doing so, we aim to empower the analyst to tailor the SL specification to their prediction task, thereby ensuring their SL performs as well as possible. A flowchart provides a concise, easy-to-follow summary of key suggestions and heuristics, based on our accumulated experience, and guided by theory.
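To make the idea of choosing among candidate strategies concrete, here is a minimal sketch of the simplest variant, often called a discrete super learner: each candidate in a small library is scored by K-fold cross-validated risk, and the candidate with the lowest risk is refit on all the data. Everything in it is an illustrative assumption rather than the article's specification: the two candidates (`fit_mean`, `fit_linear`), the helper names `cv_risk` and `discrete_super_learner`, the squared-error loss, the 5 folds, and the simulated data. A full SL would typically go one step further and combine the candidates' cross-validated predictions with a metalearner instead of picking a single winner.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Candidate 1 (hypothetical): ignore the covariate, predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate 2 (hypothetical): least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def cv_risk(fit, xs, ys, k=5):
    """K-fold cross-validated mean squared error of one candidate."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        val = [i for i in range(n) if i % k == fold]   # held-out fold
        trn = [i for i in range(n) if i % k != fold]   # remaining folds
        predict = fit([xs[i] for i in trn], [ys[i] for i in trn])
        sq_errors += [(ys[i] - predict(xs[i])) ** 2 for i in val]
    return mean(sq_errors)

def discrete_super_learner(library, xs, ys, k=5):
    """Pick the candidate with the lowest CV risk, then refit it on all data."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in library.items()}
    best = min(risks, key=risks.get)
    return best, risks, library[best](xs, ys)

# Simulated data (an assumption for illustration): linear signal plus noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

library = {"mean": fit_mean, "linear": fit_linear}
best, risks, predict = discrete_super_learner(library, xs, ys)
print(best, risks)
```

The library, the loss function, and the number of folds are all fixed before looking at the results, which is what makes the procedure pre-specified while still letting the data decide which candidate is used.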


research
08/08/2023

SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling

Causal inference is a crucial goal of science, enabling researchers to a...
research
03/07/2017

Propensity score prediction for electronic healthcare databases using Super Learner and High-dimensional Propensity Score Methods

The optimal learner for prediction modeling varies depending on the unde...
research
03/11/2021

Causal Learner: A Toolbox for Causal Structure and Markov Blanket Learning

Causal Learner is a toolbox for learning causal structure and Markov bla...
research
05/09/2018

Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference using Five Empirical Applications

Matching methods have become one frequently used method for statistical ...
research
08/14/2020

Semiparametric Estimation and Inference on Structural Target Functions using Machine Learning and Influence Functions

We aim to construct a class of learning algorithms that are of practical...
research
03/12/2022

The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning

Recent concerns that machine learning (ML) may be facing a reproducibili...
research
01/20/2022

Predictive modeling of movements of refugees and internally displaced people: Towards a computational framework

Predicting forced displacement is an important undertaking of many human...
