Dataset Cleaning – A Cross Validation Methodology for Large Facial Datasets using Face Recognition

03/24/2020
by Viktor Varkarakis, et al.

In recent years, large "in the wild" face datasets have been released to facilitate progress in tasks such as face detection and face recognition. Most of these datasets are collected from web pages by automatic procedures, and as a consequence they often contain noisy data. Identity annotations are especially important in these datasets because they are used to train face recognition algorithms. However, because the datasets are gathered automatically and are very large, many identity folders contain mislabeled samples, which degrades the quality of the datasets. This work presents a semi-automatic method for cleaning noisy large face datasets with the use of face recognition. The methodology is applied to the CelebA dataset to demonstrate its effectiveness. Furthermore, the list of mislabeled samples in the CelebA dataset is made available.
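The abstract does not spell out the cleaning procedure, so the following is only a minimal sketch of one plausible way to use face recognition for flagging mislabeled samples, not the authors' method. It assumes that each image has already been turned into a non-zero embedding vector by some face recognition model, and that embeddings are grouped by identity folder. The function name `flag_suspect_samples`, the cosine-similarity criterion, and the `threshold` value are all hypothetical choices made for illustration.

```python
import numpy as np


def flag_suspect_samples(embeddings_by_identity, threshold=0.5):
    """Flag samples that look unlike the rest of their identity folder.

    embeddings_by_identity: dict mapping an identity label to an (n, d)
        array-like of face embeddings (assumed precomputed and non-zero).
    threshold: hypothetical cut-off on the mean cosine similarity to the
        other samples of the same identity.

    Returns a list of (identity, sample_index, mean_similarity) tuples
    that a human reviewer would then confirm or reject.
    """
    suspects = []
    for identity, emb in embeddings_by_identity.items():
        emb = np.asarray(emb, dtype=float)
        n = emb.shape[0]
        if n < 2:
            # A single sample cannot be compared against its own folder.
            continue
        # L2-normalize so that dot products are cosine similarities.
        unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
        sim = unit @ unit.T
        # Mean similarity of each sample to the others (self excluded).
        mean_to_others = (sim.sum(axis=1) - np.diag(sim)) / (n - 1)
        for idx in np.flatnonzero(mean_to_others < threshold):
            suspects.append((identity, int(idx), float(mean_to_others[idx])))
    return suspects


if __name__ == "__main__":
    # Toy 2-D "embeddings": three similar vectors and one outlier.
    toy = {"id_a": [[1.0, 0.0], [0.9, 0.1], [0.95, -0.05], [0.0, 1.0]]}
    for identity, idx, score in flag_suspect_samples(toy):
        print(identity, idx, round(score, 3))
```

The manual review of the flagged samples is what would make such a pipeline semi-automatic: the similarity check narrows down the candidates, and a person makes the final call on each one.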


