Bayesian Sampling Bias Correction: Training with the Right Loss Function

06/24/2020
by L. Le Folgoc, et al.

We derive a family of loss functions to train models in the presence of sampling bias. Examples include cases where the prevalence of a pathology differs from its sampling rate in the training dataset, or where a machine learning practitioner rebalances their training dataset. Sampling bias causes large discrepancies between model performance in the lab and in more realistic settings. It is omnipresent in medical imaging applications, yet is often overlooked at training time or addressed on an ad-hoc basis. Our approach is based on Bayesian risk minimization. For arbitrary likelihood models we derive the associated bias-corrected loss for training, exhibiting a direct connection to information gain. The approach integrates seamlessly into the current paradigm of (deep) learning using stochastic backpropagation, and naturally with Bayesian models. We illustrate the methodology on case studies of lung nodule malignancy grading.
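To make the prevalence mismatch concrete, the sketch below applies the standard prior-shift correction, which reweights a classifier's predicted class probabilities by the ratio of target prevalence to training sampling rate. This is not the bias-corrected loss derived in the paper, which acts at training time; it is a simpler post-hoc adjustment, shown only to illustrate how much a rebalanced training set can distort predictions. The function name `adjust_posterior`, its parameters, and all numbers are hypothetical.

```python
import numpy as np

def adjust_posterior(p_train, train_prior, target_prior):
    """Reweight class probabilities from the training class frequencies
    to an assumed target prevalence (standard prior-shift correction).

    p_train: probabilities predicted by a model fit on the biased sample,
             with classes along the last axis.
    train_prior: class frequencies in the training set.
    target_prior: assumed class prevalence in the deployment population.
    """
    p_train = np.asarray(p_train, dtype=float)
    ratio = np.asarray(target_prior, dtype=float) / np.asarray(train_prior, dtype=float)
    unnormalized = p_train * ratio
    # Renormalize so that the corrected probabilities sum to one per sample.
    return unnormalized / unnormalized.sum(axis=-1, keepdims=True)

# Illustrative numbers only: a model trained on a 50/50 rebalanced dataset
# predicts 90% malignancy, while the assumed true prevalence is 5%.
corrected = adjust_posterior([0.1, 0.9], train_prior=[0.5, 0.5], target_prior=[0.95, 0.05])
print(corrected)  # the malignant probability drops to approximately 0.321
```

Under these assumed numbers, a confident-looking 90% prediction corresponds to roughly a one-in-three probability at the target prevalence, which is the kind of lab-versus-deployment gap the abstract describes.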
