Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling

10/04/2022
by   Yunsung Lee, et al.

There are two de facto standard architectures in recent computer vision: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). The strong inductive biases of convolutions help a model learn sample-efficiently, but these strong biases also limit the upper bound of CNNs when sufficient data are available. In contrast, ViTs are inferior to CNNs on small data but superior when sufficient data are available. Recent approaches attempt to combine the strengths of these two architectures. However, by comparing the accuracy of various models on ImageNet subsets sampled at different ratios, we show that these approaches overlook that the optimal inductive bias also changes with the target data scale. In addition, through Fourier analysis of feature maps, which reveals how a model's response patterns change with signal frequency, we observe which inductive bias is advantageous at each data scale. The more convolution-like inductive bias a model contains, the smaller the data scale at which the ViT-like model begins to outperform ResNet. To obtain a model whose inductive bias is flexible with respect to the data scale, we show that reparameterization can interpolate the inductive bias between convolution and self-attention. By adjusting the number of epochs the model stays in the convolutional form, we show that reparameterization from convolution to self-attention interpolates the Fourier analysis pattern between CNNs and ViTs. Building on these findings, we propose Progressive Reparameterization Scheduling (PRS), in which reparameterization adjusts the required amount of convolution-like or self-attention-like inductive bias per layer. For small-scale datasets, PRS performs the reparameterization from convolution to self-attention linearly faster at later layers. PRS outperforms previous approaches on small-scale datasets, e.g., CIFAR-100.
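To make the per-layer schedule concrete, below is a minimal Python sketch of the idea described in the abstract: each block starts as a convolution and is reparameterized into self-attention at an epoch that arrives linearly earlier for later (deeper) blocks. The function names (prs_switch_epoch, use_convolution) and the exact linear mapping are illustrative assumptions, not the authors' released implementation.

```python
def prs_switch_epoch(layer_idx: int, num_layers: int, total_epochs: int) -> int:
    """Epoch at which a block is reparameterized from convolution to
    self-attention. Deeper blocks switch earlier, following a linear
    per-layer schedule (illustrative, not the paper's exact constants)."""
    # Fraction of depth: 0.0 for the first block, 1.0 for the last block.
    depth = layer_idx / max(num_layers - 1, 1)
    # Linearly earlier switch for deeper blocks.
    return int(round(total_epochs * (1.0 - depth)))


def use_convolution(layer_idx: int, num_layers: int,
                    epoch: int, total_epochs: int) -> bool:
    """True while the block should keep its convolutional form."""
    return epoch < prs_switch_epoch(layer_idx, num_layers, total_epochs)


if __name__ == "__main__":
    num_layers, total_epochs = 12, 100
    for layer in range(num_layers):
        print(f"block {layer:2d} switches to self-attention at epoch "
              f"{prs_switch_epoch(layer, num_layers, total_epochs)}")
```

In a training loop, use_convolution(layer, num_layers, epoch, total_epochs) would decide, per block and per epoch, whether the block runs in its convolutional or self-attention form, so early layers retain convolution-like inductive bias longer while late layers transition to self-attention sooner.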
