Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results

04/07/2022
by   Tal Ridnik, et al.

ImageNet serves as the primary dataset for evaluating the quality of computer-vision models. The common practice today is to train each architecture with a tailor-made scheme, designed and tuned by an expert. In this paper, we present a unified scheme for training any backbone on ImageNet. The scheme, named USI (Unified Scheme for ImageNet), is based on knowledge distillation and modern tricks. It requires no adjustments or hyper-parameter tuning between different models, and is efficient in terms of training time. We test USI on a wide variety of architectures, including CNNs, Transformers, mobile-oriented models, and MLP-only models. On all models tested, USI outperforms previous state-of-the-art results. Hence, we are able to transform training on ImageNet from an expert-oriented task into an automatic, seamless routine. Since USI accepts any backbone and trains it to top results, it also enables methodical comparisons and the identification of the most efficient backbones along the speed-accuracy Pareto curve. Implementation is available at: https://github.com/Alibaba-MIIL/Solving_ImageNet
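
The abstract names knowledge distillation as the basis of the scheme but does not spell out the loss. As a rough illustration only, the sketch below shows a generic, textbook-style distillation objective for a single sample: a KL term that pulls the student's softened predictions toward the teacher's, mixed with ordinary cross-entropy on the ground-truth label. The function names, the `alpha` mixing weight, and the `temperature` parameter are assumptions made for this example; they are not taken from the paper or from the linked repository, and the actual USI loss may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=1.0):
    """Generic distillation loss for one sample (illustrative, not USI's exact loss).

    alpha and temperature are hypothetical knobs chosen for this sketch.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the temperature-softened distributions.
    kd_term = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )

    # Standard cross-entropy of the student against the hard label.
    ce_term = -math.log(softmax(student_logits)[label])

    # The temperature**2 factor is the usual rescaling of the soft-target term.
    return alpha * (temperature ** 2) * kd_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # made-up teacher logits for a 3-class toy case
    student = [2.0, 1.5, 0.2]   # made-up student logits
    print(distillation_loss(student, teacher, label=0))
```

In such a setup the teacher is a fixed, already-trained model and only the student's parameters are updated; with `alpha=1.0` the student learns from the teacher's predictions alone.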

research
04/22/2021

ImageNet-21K Pretraining for the Masses

ImageNet-1K serves as the primary dataset for pretraining deep learning ...
research
09/30/2022

MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features

MobileViT (MobileViTv1) combines convolutional neural networks (CNNs) an...
research
11/09/2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation

Audio Spectrogram Transformer models rule the field of Audio Tagging, ou...
research
05/30/2023

Are Large Kernels Better Teachers than Transformers for ConvNets?

This paper reveals a new appeal of the recently emerged large-kernel Con...
research
09/17/2020

MEAL V2: Boosting Vanilla ResNet-50 to 80 without Tricks

In this paper, we introduce a simple yet effective approach that can boo...
research
04/26/2023

UniNeXt: Exploring A Unified Architecture for Vision Recognition

Vision Transformers have shown great potential in computer vision tasks....
research
03/30/2021

Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning

The convolutional neural networks (CNNs) trained on ILSVRC12 ImageNet we...
