Co-advise: Cross Inductive Bias Distillation

06/23/2021
by Sucheng Ren, et al.

Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks. However, their supremacy degenerates when the amount of training data is insufficient (e.g., ImageNet). To make them practical, we propose a novel distillation-based method to train vision transformers. Unlike previous works, which provide only heavyweight convolution-based teachers, we introduce lightweight teachers with different architectural inductive biases (e.g., convolution and involution) to co-advise the student transformer. The key insight is that teachers with different inductive biases acquire different knowledge despite being trained on the same dataset, and such diverse knowledge compounds and boosts the student's performance during distillation. Equipped with this cross inductive bias distillation method, our vision transformers (termed CivT) outperform all previous transformers of the same architecture on ImageNet.
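The co-advising idea can be summarized as a student loss that combines the usual hard-label term with soft-label distillation terms from each teacher. The sketch below is a minimal PyTorch illustration of that idea under stated assumptions, not the authors' exact training recipe; the function name, the temperature `tau`, and the equal weighting of the two teachers are all assumptions.

```python
import torch
import torch.nn.functional as F

def co_advise_loss(student_logits, conv_logits, inv_logits, labels,
                   tau=3.0, alpha=0.5):
    """Hypothetical combined loss: hard-label cross-entropy plus
    soft-label distillation from two teachers with different
    inductive biases (e.g., convolution and involution)."""
    # Supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label KL terms against each (frozen) teacher. The tau**2
    # factor keeps gradient magnitudes comparable across temperatures.
    log_p = F.log_softmax(student_logits / tau, dim=-1)
    kd_conv = F.kl_div(log_p, F.softmax(conv_logits / tau, dim=-1),
                       reduction="batchmean") * tau ** 2
    kd_inv = F.kl_div(log_p, F.softmax(inv_logits / tau, dim=-1),
                      reduction="batchmean") * tau ** 2

    # Equal weighting of the two teachers is an assumption here.
    return alpha * ce + (1 - alpha) * 0.5 * (kd_conv + kd_inv)

# Usage sketch: the teachers are frozen; only the student is updated.
# with torch.no_grad():
#     conv_logits = conv_teacher(images)
#     inv_logits = inv_teacher(images)
# loss = co_advise_loss(student(images), conv_logits, inv_logits, labels)
```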

Related research

- Training data-efficient image transformers & distillation through attention (12/23/2020): Recently, neural networks purely based on attention were shown to addres...
- Transferring Inductive Biases through Knowledge Distillation (05/31/2020): Having the right inductive biases can be crucial in many tasks or scenar...
- 2-D SSM: A General Spatial Layer for Visual Transformers (06/11/2023): A central objective in computer vision is to design models with appropri...
- Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? (07/21/2022): There has been a lot of interest in the scaling properties of Transform...
- Leveraging Transformers for StarCraft Macromanagement Prediction (10/11/2021): Inspired by the recent success of transformers in natural language proce...
- Cumulative Spatial Knowledge Distillation for Vision Transformers (07/17/2023): Distilling knowledge from convolutional neural networks (CNNs) is a doub...
- SP-ViT: Learning 2D Spatial Priors for Vision Transformers (06/15/2022): Recently, transformers have shown great potential in image classificatio...
