Exposing and addressing the fragility of neural networks in digital pathology

by Joona Pohjonen, et al.

Neural networks have achieved impressive results in many medical imaging tasks but often perform substantially worse on out-of-distribution datasets originating from different medical centres or patient cohorts. Evaluating this lack of generalisation and addressing the underlying problem are the two main challenges in developing neural networks intended for clinical practice. In this study, we develop a new method for evaluating neural networks' ability to generalise by generating a large number of distribution-shifted datasets, which can be used to thoroughly investigate their robustness to the variability encountered in clinical practice. Compared to external validation, shifted evaluation can explain why neural networks fail on a given dataset, thus offering guidance on how to improve model robustness. With shifted evaluation, we demonstrate that neural networks trained with state-of-the-art methods are highly fragile to even small distribution shifts from the training data and, in some cases, lose all discrimination ability. To address this fragility, we develop an augmentation strategy explicitly designed to increase neural networks' robustness to distribution shifts. This strategy is evaluated with large-scale, heterogeneous histopathology data, including five training datasets from two tissue types, 274 distribution-shifted datasets and 20 external datasets from four countries. Neural networks trained with this strategy retain similar performance on all datasets, even under distribution shifts where networks trained with current state-of-the-art methods lose all discrimination ability. We recommend using strong augmentation and shifted evaluation to train and evaluate all neural networks intended for clinical practice.
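The shifted-evaluation idea described above can be sketched in a few lines: apply a family of fixed, parameterised distribution shifts to a held-out dataset and measure performance on each shifted copy. The sketch below is a minimal illustration, not the paper's implementation; the shift parameters (brightness, contrast, additive noise) and the function names are hypothetical stand-ins for the richer histopathology-specific shifts the study uses.

```python
import numpy as np

def shift_distribution(images, brightness=0.0, contrast=1.0, noise_std=0.0, seed=0):
    """Apply a fixed distribution shift to a batch of images in [0, 1].

    Hypothetical example shifts: additive brightness, multiplicative
    contrast, and Gaussian pixel noise.
    """
    rng = np.random.default_rng(seed)
    shifted = images * contrast + brightness
    if noise_std > 0:
        shifted = shifted + rng.normal(0.0, noise_std, images.shape)
    return np.clip(shifted, 0.0, 1.0)

def shifted_evaluation(model_fn, images, labels, shifts):
    """Evaluate model accuracy on each distribution-shifted copy of a dataset.

    `shifts` maps a shift name to keyword arguments for shift_distribution,
    so performance can be tracked as a function of shift magnitude.
    """
    results = {}
    for name, params in shifts.items():
        preds = model_fn(shift_distribution(images, **params))
        results[name] = float(np.mean(preds == labels))
    return results

# Usage with a toy "model": grid the shift magnitudes and inspect how
# accuracy degrades as the data moves away from the training distribution.
if __name__ == "__main__":
    images = np.random.default_rng(1).uniform(size=(16, 32, 32, 3))
    labels = np.random.default_rng(2).integers(0, 2, size=16)
    model_fn = lambda x: (x.mean(axis=(1, 2, 3)) > 0.5).astype(int)
    shifts = {
        "identity": {},
        "bright_0.2": {"brightness": 0.2},
        "noisy_0.1": {"noise_std": 0.1},
    }
    print(shifted_evaluation(model_fn, images, labels, shifts))
```

Sweeping such parameters over a grid yields the "large number of distribution-shifted datasets" the abstract refers to, and plotting accuracy against shift magnitude shows where a model's discrimination ability collapses.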

