Assessing Dataset Bias in Computer Vision

05/03/2022
by Athiya Deviyani, et al.

A biased dataset is generally one whose attributes have an uneven class distribution. These biases tend to propagate to the models trained on such data, often leading to poor performance on the minority class. In this project, we explored the extent to which various data augmentation methods alleviate intrinsic biases within a dataset. We applied several augmentation techniques to a sample of the UTKFace dataset: undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs). We then trained a classifier on each of the augmented datasets and evaluated its performance on the native test set and on external facial recognition datasets. We also compared their performance to the state-of-the-art attribute classifier trained on the FairFace dataset. Through experimentation, we found that training the model on StarGAN-generated images led to the best overall performance. We also found that training on geometrically transformed images led to similar performance with a much shorter training time. Additionally, the best-performing models exhibited uniform performance across the classes within each attribute. This indicates that these models also mitigated the biases present in the baseline model trained on the original training set. Finally, we showed that our model has better overall performance and consistency on age and ethnicity classification across multiple datasets when compared with the FairFace model. Our final model has an accuracy on the UTKFace test set of 91.75 ethnicity attribute respectively, with a standard deviation of less than 0.1 between the accuracies of the classes of each attribute.
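Two ideas from the abstract can be illustrated with a minimal sketch: undersampling to equalize class counts, and measuring how uniform a classifier is across classes via the standard deviation of its per-class accuracies. The function names below (`undersample`, `per_class_accuracy`, `accuracy_spread`) are hypothetical and not taken from the paper, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import random
from collections import defaultdict
from statistics import pstdev


def undersample(labels, seed=0):
    """Return sorted indices that keep, for every class, as many
    samples as the rarest class has (hypothetical helper)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    target = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    return sorted(kept)


def per_class_accuracy(y_true, y_pred):
    """Map each true class to the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {cls: correct[cls] / total[cls] for cls in total}


def accuracy_spread(y_true, y_pred):
    """Population standard deviation of the per-class accuracies
    (assumption: the paper may use a different estimator)."""
    return pstdev(list(per_class_accuracy(y_true, y_pred).values()))


if __name__ == "__main__":
    # Toy labels standing in for one attribute (not real UTKFace data).
    labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 3
    balanced = undersample(labels)
    print([labels[i] for i in balanced])
    print(accuracy_spread(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

A low spread means the classifier performs about equally well on every class of an attribute, which is the sense of "uniform performance" used above. Undersampling discards majority-class samples, so it trades dataset size for balance.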


