Statistical quantification of confounding bias in predictive modelling

11/01/2021
by Tamas Spisak, et al.

The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively. The tests provide strict control of Type I errors and high statistical power, even for predictions with the non-normal and non-linear dependencies often seen in machine learning. Applying the proposed tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or that proved hard to correct for with state-of-the-art confound mitigation approaches. The tests, implemented in the package mlconfound (https://mlconfound.readthedocs.io), can aid the assessment and improvement of the generalizability and neurobiological validity of predictive models and, thereby, foster the development of clinically useful machine learning biomarkers.
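
To make the null hypothesis of the partial confounder test concrete (the prediction is independent of the confounder once the target is accounted for), the following is a minimal sketch of a much simpler, linear stand-in: a permutation test on the partial correlation between prediction and confounder given the target. It is not the test proposed in the paper and does not use the mlconfound API; unlike the proposed tests, it assumes linear dependencies. The function names, the simulated data and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def residualize(v, y):
    """Return the part of v not linearly explained by y (with intercept)."""
    X = np.column_stack([np.ones(len(y)), y])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ beta


def naive_partial_confound_check(y, yhat, c, num_perms=1000, seed=0):
    """Simplified, linear-only illustration (hypothetical helper).

    Null hypothesis: yhat and c are linearly unrelated given y.
    Returns the absolute partial correlation and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    r_yhat = residualize(yhat, y)
    r_c = residualize(c, y)
    observed = abs(np.corrcoef(r_yhat, r_c)[0, 1])
    # Build the null distribution by shuffling the confounder residuals.
    null = np.array([
        abs(np.corrcoef(r_yhat, rng.permutation(r_c))[0, 1])
        for _ in range(num_perms)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + num_perms)
    return observed, p_value


# Simulated example: the confounder c is correlated with the target y.
rng = np.random.default_rng(42)
n = 500
y = rng.normal(size=n)
c = 0.5 * y + rng.normal(size=n)

# A prediction driven by the target only, and one that also leaks the confounder.
yhat_clean = y + 0.5 * rng.normal(size=n)
yhat_confounded = y + 0.8 * c + 0.5 * rng.normal(size=n)

r_clean, p_clean = naive_partial_confound_check(y, yhat_clean, c)
r_conf, p_conf = naive_partial_confound_check(y, yhat_confounded, c)
print(f"clean prediction:      |r| = {r_clean:.3f}, p = {p_clean:.4f}")
print(f"confounded prediction: |r| = {r_conf:.3f}, p = {p_conf:.4f}")
```

In this simulation a small p-value for the confounded prediction signals that the prediction carries confounder information beyond what the target explains. For non-linear or non-normal dependencies, which the abstract names as the motivating case, this linear residualization can miss or misstate confounding.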


