AI in the Loop – Functionalizing Fold Performance Disagreement to Monitor Automated Medical Image Segmentation Pipelines

05/15/2023
by Harrison C. Gottlich, et al.

Methods for automatically flagging poorly performing predictions are essential for safely implementing machine learning workflows in clinical practice and for identifying difficult cases during model training. We present a readily adoptable method using sub-models trained on different dataset folds, where their disagreement serves as a surrogate for model confidence. Thresholds informed by human interobserver values were used to determine whether a final ensemble model prediction would require manual review. In two different datasets (abdominal CT and MR predicting kidney tumors), our framework effectively identified low-performing automated segmentations. Flagging images with a minimum interfold test Dice score below human interobserver variability maximized the number of flagged images while ensuring maximum ensemble test Dice. When our internally trained model was applied to an external publicly available dataset (KiTS21), flagged images included smaller tumors than those observed in our internally trained dataset, demonstrating the method's robustness in flagging poorly performing out-of-distribution input data. Comparing interfold sub-model disagreement against human interobserver values is an efficient way to approximate a model's epistemic uncertainty - its lack of knowledge due to insufficient relevant training data - a key functionality for adopting these applications in clinical practice.
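The flagging rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "interfold Dice" means the pairwise Dice score between the binary masks predicted by the fold sub-models, and the function names (`dice`, `min_interfold_dice`, `needs_review`) and the threshold value used in the usage example are hypothetical.

```python
from itertools import combinations

import numpy as np


def dice(mask_a, mask_b):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumption: two empty predictions count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def min_interfold_dice(fold_masks):
    """Lowest pairwise Dice among the sub-model predictions for one image."""
    return min(dice(a, b) for a, b in combinations(fold_masks, 2))


def needs_review(fold_masks, interobserver_dice):
    """Flag the image when sub-models agree less than human observers do."""
    return min_interfold_dice(fold_masks) < interobserver_dice


if __name__ == "__main__":
    # Three toy 2x2 predictions; the third sub-model disagrees with the others.
    preds = [
        np.array([[1, 1], [0, 0]]),
        np.array([[1, 1], [0, 0]]),
        np.array([[1, 0], [0, 0]]),
    ]
    # 0.8 is a placeholder threshold, not a value taken from the paper.
    print(min_interfold_dice(preds), needs_review(preds, 0.8))
```

In this sketch a single disagreeing sub-model is enough to flag the image, which reflects the use of the minimum rather than the mean interfold score.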

research
07/26/2023

Role of Image Acquisition and Patient Phenotype Variations in Automatic Segmentation Model Generalization

Purpose: This study evaluated the out-of-domain performance and generali...
research
05/24/2021

Brain tumour segmentation using a triplanar ensemble of U-Nets

Gliomas appear with wide variation in their characteristics both in term...
research
07/12/2021

The Power of Proxy Data and Proxy Networks for Hyper-Parameter Optimization in Medical Image Segmentation

Deep learning models for medical image segmentation are primarily data-d...
research
08/31/2021

Uncertainty Quantified Deep Learning for Predicting Dice Coefficient of Digital Histopathology Image Segmentation

Deep learning models (DLMs) can achieve state of the art performance in ...
research
12/04/2020

Statistical inference of the inter-sample Dice distribution for discriminative CNN brain lesion segmentation models

Discriminative convolutional neural networks (CNNs), for which a voxel-w...
research
03/10/2023

Explainable Semantic Medical Image Segmentation with Style

Semantic medical image segmentation using deep learning has recently ach...
research
08/19/2022

Ensemble uncertainty as a criterion for dataset expansion in distinct bone segmentation from upper-body CT images

Purpose: The localisation and segmentation of individual bones is an imp...