The Lie Derivative for Measuring Learned Equivariance

10/06/2022
by Nate Gruver, et al.

Equivariance guarantees that a model's predictions capture key symmetries in data. When an image is translated or rotated, an equivariant model's representation of that image will translate or rotate accordingly. The success of convolutional neural networks has historically been tied to translation equivariance directly encoded in their architecture. The rising success of vision transformers, which have no explicit architectural bias towards equivariance, challenges this narrative and suggests that augmentations and training data might also play a significant role in their performance. In order to better understand the role of equivariance in recent vision models, we introduce the Lie derivative, a method for measuring equivariance with strong mathematical foundations and minimal hyperparameters. Using the Lie derivative, we study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures. The scale of our analysis allows us to separate the impact of architecture from other factors like model size or training method. Surprisingly, we find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities, and that as models get larger and more accurate they tend to display more equivariance, regardless of architecture. For example, transformers can be more equivariant than convolutional neural networks after training.
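As a toy illustration of the idea (not the authors' implementation), the Lie derivative of a function f under a one-parameter transformation group g_t measures the rate at which f's output changes as the input is continuously transformed: for a scalar, translation-invariant f it reduces to d/dt f(g_t x) at t = 0, which can be estimated with finite differences. The sketch below assumes a 1-D signal, a differentiable fractional circular shift implemented with a Fourier phase ramp, and two hypothetical example functions; all names are illustrative.

```python
import numpy as np

def frac_shift(x, t):
    """Circularly shift a 1-D signal by a fractional amount t (in samples),
    using a Fourier phase ramp so the shift is smooth in t."""
    k = np.fft.fftfreq(len(x))            # frequencies in cycles/sample
    return np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * k * t)).real

def lie_derivative(f, x, eps=1e-3):
    """Central finite-difference estimate of d/dt f(shift_t(x)) at t = 0.
    For a translation-invariant scalar f this should vanish."""
    return (f(frac_shift(x, eps)) - f(frac_shift(x, -eps))) / (2 * eps)

# An odd length keeps the fractionally shifted real signal exactly real.
s = np.arange(129) / 129.0
x = np.sin(2 * np.pi * 3 * s) + 0.5 * np.cos(2 * np.pi * 7 * s)

energy = lambda v: np.sum(v ** 2)   # translation-invariant (Parseval)
sample0 = lambda v: v[0]            # reads a fixed position: not invariant

print(lie_derivative(energy, x))    # ~0: symmetry is respected
print(lie_derivative(sample0, x))   # nonzero: symmetry is violated
```

The same finite-difference recipe extends to vector-valued model outputs by transforming the output back with g_{-t} before differentiating, which is the equivariance (rather than invariance) case studied in the paper.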

Related research

06/07/2022 · Can CNNs Be More Robust Than Transformers?
The recent success of Vision Transformers is shaking the long dominance ...

09/15/2023 · Biased Attention: Do Vision Transformers Amplify Gender Bias More than Convolutional Neural Networks?
Deep neural networks used in computer vision have been shown to exhibit ...

05/28/2022 · WaveMix-Lite: A Resource-efficient Neural Network for Image Analysis
Gains in the ability to generalize on image analysis tasks for neural ne...

06/01/2022 · A comparative study between vision transformers and CNNs in digital pathology
Recently, vision transformers were shown to be capable of outperforming ...

06/18/2021 · How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Vision Transformers (ViT) have been shown to attain highly competitive p...

04/25/2023 · iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer
In the last few years, the success of Transformers in computer vision ha...

11/29/2022 · RGB no more: Minimally-decoded JPEG Vision Transformers
Most neural networks for computer vision are designed to infer using RGB...
