Deep Hypothesis Tests Detect Clinically Relevant Subgroup Shifts in Medical Images

03/08/2023
by Lisa M. Koch, et al.

Distribution shifts remain a fundamental problem for the safe application of machine learning systems. If undetected, they may impact the real-world performance of such systems or will at least render original performance claims invalid. In this paper, we focus on the detection of subgroup shifts, a type of distribution shift that can occur when subgroups have a different prevalence during validation compared to the deployment setting. For example, algorithms developed on data from various acquisition settings may be predominantly applied in hospitals with lower quality data acquisition, leading to an inadvertent performance drop. We formulate subgroup shift detection in the framework of statistical hypothesis testing and show that recent state-of-the-art statistical tests can be effectively applied to subgroup shift detection on medical imaging data. We provide synthetic experiments as well as extensive evaluation on clinically meaningful subgroup shifts on histopathology as well as retinal fundus images. We conclude that classifier-based subgroup shift detection tests could be a particularly useful tool for post-market surveillance of deployed ML systems.
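To make the hypothesis-testing formulation concrete, the following is a minimal sketch of a classifier two-sample test applied to a simulated subgroup shift. It is not the authors' implementation: the scalar feature, the two Gaussian subgroups, the prevalence values, and the helper names (`sample_mixture`, `binomial_tail`, `c2st_p_value`) are all assumptions made for illustration, and a simple threshold classifier stands in for the deep networks that would be trained on images. The null hypothesis is that source and target data come from the same distribution; if a classifier trained to tell them apart beats chance on held-out data, the null is rejected.

```python
import math
import random
from statistics import fmean


def sample_mixture(n, prevalence, rng):
    """Draw n scalar features; a fraction `prevalence` comes from subgroup B.

    Subgroup A ~ N(0, 1) and subgroup B ~ N(2, 1) are hypothetical stand-ins
    for, e.g., high- and low-quality image acquisitions.
    """
    return [
        rng.gauss(2.0, 1.0) if rng.random() < prevalence else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


def binomial_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5), computed with exact integers."""
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2**n


def c2st_p_value(source, target, rng):
    """Classifier two-sample test with a 1-D threshold classifier.

    Assumes equally sized samples, so that chance accuracy is about 0.5
    and the binomial null is a reasonable approximation.
    """
    data = [(x, 0) for x in source] + [(x, 1) for x in target]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    # "Train": place the threshold midway between the two class means.
    mean_src = fmean(x for x, y in train if y == 0)
    mean_tgt = fmean(x for x, y in train if y == 1)
    threshold = (mean_src + mean_tgt) / 2
    sign = 1 if mean_tgt >= mean_src else -1

    # "Test": count held-out samples whose domain is predicted correctly.
    correct = sum(
        1 for x, y in test if (1 if sign * (x - threshold) > 0 else 0) == y
    )
    # Under H0 the number of correct predictions is approximately Binomial(n, 0.5).
    return binomial_tail(correct, len(test))


rng = random.Random(0)
validation = sample_mixture(500, 0.1, rng)   # subgroup B is rare
deployment = sample_mixture(500, 0.9, rng)   # subgroup B dominates
no_shift = sample_mixture(500, 0.1, rng)     # same prevalence as validation

p_shift = c2st_p_value(validation, deployment, rng)
p_same = c2st_p_value(validation, no_shift, rng)
print("p-value under subgroup shift:", p_shift)
print("p-value without shift:", p_same)
```

A p-value below the chosen significance level (0.05 is a common choice) would flag the shift. In this sketch only the subgroup prevalence differs between the two samples, which is exactly the situation the abstract describes.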


