Prevention is better than cure: a case study of the abnormalities detection in the chest

05/18/2023
by Weronika Hryniewska, et al.

Prevention is better than cure. This old truth applies not only to the prevention of diseases but also to the prevention of issues with AI models used in medicine. Predictive models often malfunction not because of the training process but because of problems that reach back to the data acquisition phase or the design of the experiment. In this paper, we analyze in detail a single use case: a Kaggle competition on detecting abnormalities in chest X-ray images. We demonstrate how a series of simple tests for data imbalance exposes faults in the data acquisition and annotation process. Complex models can learn such artifacts, and the resulting bias is difficult to remove during or after training. Errors made at the data collection stage also make it difficult to validate the model correctly. Based on this use case, we show how to monitor data and model balance (fairness) throughout the life cycle of a predictive model, from data acquisition to parity analysis of model scores.
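
The abstract does not say which imbalance tests the authors run, so the following is only a minimal sketch of one form such a check could take, not the paper's method. It compares the share of "abnormal" labels across groups (here, a hypothetical `annotator` field) and reports a simple parity ratio. The field names, the toy records, and the helper functions `rate_by_group` and `parity_ratio` are all assumptions made for illustration.

```python
from collections import Counter


def rate_by_group(records, group_key, label_key, positive):
    """Return {group: share of records whose label equals `positive`}."""
    totals = Counter()
    hits = Counter()
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec[label_key] == positive:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def parity_ratio(rates):
    """Smallest group rate divided by the largest (1.0 means equal rates)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no positives in any group, so rates are trivially equal
    return min(rates.values()) / highest


# Toy, made-up records; a real check would read the dataset's annotation file.
records = [
    {"annotator": "R1", "label": "abnormal"},
    {"annotator": "R1", "label": "abnormal"},
    {"annotator": "R1", "label": "abnormal"},
    {"annotator": "R1", "label": "normal"},
    {"annotator": "R2", "label": "abnormal"},
    {"annotator": "R2", "label": "normal"},
    {"annotator": "R2", "label": "normal"},
    {"annotator": "R2", "label": "normal"},
]

rates = rate_by_group(records, "annotator", "label", "abnormal")
ratio = parity_ratio(rates)
print(rates, ratio)
```

In the toy data, annotator R1 marks three of four images as abnormal and R2 marks one of four, so the ratio is one third. A ratio far below 1.0 would suggest that the label distribution depends on who annotated the image, which is the kind of acquisition or annotation artifact a model could learn. The same two functions could be applied to thresholded model scores instead of labels for a rough parity check on predictions.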


