Zero-shot racially balanced dataset generation using an existing biased StyleGAN2

05/12/2023
by   Anubhav Jain, et al.

Facial recognition systems have made significant strides thanks to data-heavy deep learning models, but these models rely on large, privacy-sensitive datasets. Many of these datasets lack ethnic and demographic diversity, which can lead to biased models with serious societal and security implications. To address these issues, we propose a methodology that leverages the biased generative model StyleGAN2 to create demographically diverse images of synthetic individuals. The synthetic dataset is created using a novel evolutionary search algorithm that targets specific demographic groups. By training face recognition models on the resulting balanced dataset, which contains 50,000 identities per race (13.5 million images in total), we can improve their performance and minimize biases that might have been present in a model trained on a real dataset.
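
The abstract does not spell out the evolutionary search, so the following is only a minimal sketch of the general idea: a generic elitist evolutionary loop over latent vectors, not the paper's algorithm. Every name here is hypothetical. In particular, `toy_target_score` is a stand-in for what would presumably be a demographic classifier applied to images produced by the generator; no StyleGAN2 model is loaded or called.

```python
import random


def evolutionary_search(score, dim, pop_size=32, elite=8, sigma=0.3,
                        generations=40, seed=0):
    """Generic elitist evolutionary search over latent vectors (illustrative).

    `score` maps a latent vector (list of floats) to a number; higher is better.
    """
    rng = random.Random(seed)
    # Start from random latents, as one would when sampling a generator's input.
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the highest-scoring latents unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[:elite]
        # Refill the population with Gaussian-mutated copies of the parents.
        children = []
        while len(parents) + len(children) < pop_size:
            parent = rng.choice(parents)
            children.append([x + rng.gauss(0.0, sigma) for x in parent])
        population = parents + children
    return max(population, key=score)


def toy_target_score(latent):
    """Hypothetical stand-in for 'classifier confidence in the target group'.

    Here it is simply the negative squared distance to an arbitrary point.
    """
    target = [1.0] * len(latent)
    return -sum((x - t) ** 2 for x, t in zip(latent, target))


if __name__ == "__main__":
    best = evolutionary_search(toy_target_score, dim=4)
    print(toy_target_score(best))
```

Because the top-scoring latents are carried over unchanged each generation, the best score in the population never decreases. In a real pipeline the score function would be far more expensive, since each evaluation would require generating and classifying an image.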


research
04/17/2023

A Real Balanced Dataset For Understanding Bias? Factors That Impact Accuracy, Not Numbers of Identities and Images

The issue of disparities in face recognition accuracy across demographic...
research
09/15/2023

Toward responsible face datasets: modeling the distribution of a disentangled latent space for sampling face images from demographic groups

Recently, it has been exposed that some modern facial recognition system...
research
10/11/2021

EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset

Recent deep face hallucination methods show stunning performance in supe...
research
09/19/2022

Fairness on Synthetic Visual and Thermal Mask Images

In this paper, we study performance and fairness on visual and thermal i...
research
12/05/2022

A Dataless FaceSwap Detection Approach Using Synthetic Images

Face swapping technology used to create "Deepfakes" has advanced signifi...
research
08/07/2023

Balanced Face Dataset: Guiding StyleGAN to Generate Labeled Synthetic Face Image Dataset for Underrepresented Group

For a machine learning model to generalize effectively to unseen data wi...
research
12/08/2021

A study of Biases in Common Face Recognition Datasets

The performance of commonly used facial recognition datasets have been s...
