The (de)biasing effect of GAN-based augmentation methods on skin lesion images

06/30/2022
by Agnieszka Mikołajczyk, et al.

New medical datasets are increasingly open to the public, enabling better and more extensive research. Although prepared with the utmost care, these datasets can still be a source of spurious correlations that affect the learning process. Moreover, data collections are usually not large enough and are often unbalanced. One approach to alleviating data imbalance is data augmentation with Generative Adversarial Networks (GANs), which extends the dataset with high-quality synthetic images. However, GANs are usually trained on the same biased datasets as the target data, so the generated instances tend to be biased as well. This work explores unconditional and conditional GANs to compare how they inherit bias and how the synthetic data influences the models. We provide extensive manual annotation of potentially biasing artifacts on the well-known ISIC dataset of skin lesions. In addition, we examine classification models trained on both real and synthetic data using counterfactual bias explanations. Our experiments show that GANs inherit biases and sometimes even amplify them, leading to even stronger spurious correlations. The manual annotations and synthetic images are publicly available for reproducible scientific research.

