Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning

03/30/2021
by   Csaba Kertész, et al.

Convolutional neural networks (CNNs) trained on ILSVRC12 ImageNet were the backbone of various applications as generic classifiers, feature extractors or base models for transfer learning. This paper describes automated heuristics based on model consensus, explainability and confident learning to correct labeling mistakes and remove ambiguous images from this dataset. After these changes are applied to the training and validation sets, ImageNet-Clean improves the performance of the evaluated models by 2-2.4%. The results support the importance of larger image corpora and semi-supervised learning, but the original datasets must be fixed to avoid transmitting their mistakes and biases to the student learner. Further contributions describe the training impacts of widescreen input resolutions in portrait and landscape orientations. The trained models and scripts are published on GitHub (https://github.com/kecsap/imagenet-clean) to clean up the ImageNet and ImageNetV2 datasets for reproducible research.
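To make the idea of combining model consensus with confident learning more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation and does not reproduce the scripts in the linked repository. The function names (`class_thresholds`, `confident_label`, `consensus_issues`) are hypothetical, and the per-class threshold rule is the common confident-learning formulation (mean predicted probability of a class over the samples that carry that label), assumed here for illustration. The sketch also assumes that every class has at least one labeled sample and that each model supplies an array of softmax probabilities of shape (samples, classes).

```python
import numpy as np

def class_thresholds(probs, labels):
    """Per-class threshold: mean predicted probability of class j
    over the samples whose given label is j (assumes no empty class)."""
    n_classes = probs.shape[1]
    return np.array([probs[labels == j, j].mean() for j in range(n_classes)])

def confident_label(probs, thresholds):
    """Most probable class among those at or above their threshold,
    or -1 when no class reaches its threshold."""
    masked = np.where(probs >= thresholds, probs, -1.0)
    best = masked.argmax(axis=1)
    best[masked.max(axis=1) < 0] = -1
    return best

def consensus_issues(probs_per_model, labels):
    """Flag samples where every model confidently picks the same class
    and that class differs from the given label.
    Returns (indices, suggested_labels)."""
    suggestions = np.stack([
        confident_label(p, class_thresholds(p, labels))
        for p in probs_per_model
    ])
    first = suggestions[0]
    agree = (suggestions == first).all(axis=0)
    flagged = agree & (first != -1) & (first != labels)
    idx = np.flatnonzero(flagged)
    return idx, first[idx]
```

Requiring agreement across all models is a deliberately conservative choice in this sketch: a sample flagged by only one model is left untouched, which trades recall for fewer false corrections.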


