Learning Probabilistic Symmetrization for Architecture Agnostic Equivariance

06/05/2023
by Jinwoo Kim, et al.
We present a novel framework to overcome the limitations of equivariant architectures in learning functions with group symmetries. In contrast to equivariant architectures, we use an arbitrary base model (such as an MLP or a transformer) and symmetrize it to be equivariant to the given group by employing a small equivariant network that parameterizes the probability distribution underlying the symmetrization. The distribution is trained end-to-end with the base model, which can maximize performance while reducing the sample complexity of symmetrization. We show that this approach ensures not only equivariance to the given group but also universal approximation capability in expectation. We implement our method on a simple patch-based transformer that can be initialized from pretrained vision transformers, and test it on a wide range of symmetry groups including permutation and Euclidean groups and their combinations. Empirical tests show competitive results against tailored equivariant architectures, suggesting the potential for learning equivariant functions for diverse groups using a non-equivariant universal base architecture. We further show evidence of enhanced learning in symmetric modalities, like graphs, when pretrained on non-symmetric modalities, like vision. Our implementation will be open-sourced at https://github.com/jw9730/lps.
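As a rough illustration of the symmetrization scheme described above, here is a minimal PyTorch sketch assuming the permutation group acting on a set-shaped input of size n x d. The class and helper names (ProbabilisticSymmetrization, sample_permutation, score_net) are hypothetical and not taken from the paper's released code, and the hard-permutation sampler stands in for whatever differentiable relaxation the actual method uses.

```python
import torch
import torch.nn as nn

# Minimal sketch of probabilistic symmetrization for the permutation group S_n,
# acting on a set-structured input of shape (n, d). Illustrative only; not the
# authors' implementation.

def sample_permutation(scores):
    # Hypothetical sampler: turns per-element scores into a hard permutation
    # matrix via argsort. In practice a differentiable relaxation would be used.
    idx = torch.argsort(scores, descending=True)
    n = scores.shape[0]
    perm = torch.zeros(n, n)
    perm[torch.arange(n), idx] = 1.0
    return perm

class ProbabilisticSymmetrization(nn.Module):
    def __init__(self, base_model, score_net):
        super().__init__()
        self.base_model = base_model  # arbitrary non-equivariant base, e.g. an MLP or transformer
        self.score_net = score_net    # small equivariant net parameterizing the distribution over group elements

    def forward(self, x, num_samples=4):
        outs = []
        for _ in range(num_samples):
            noise = torch.randn_like(x)                     # random input to the distribution network
            scores = self.score_net(x + noise).squeeze(-1)  # per-element scores, shape (n,)
            g = sample_permutation(scores)                  # g ~ q(g | x), a permutation matrix
            y = self.base_model(g @ x)                      # apply base model to the transformed input
            outs.append(g.T @ y)                            # undo the transformation on the output (g^{-1} = g^T)
        return torch.stack(outs).mean(0)                    # Monte Carlo average over sampled group elements

# Usage sketch: a per-element linear map as the score net and a small MLP as the base model.
n, d = 5, 8
model = ProbabilisticSymmetrization(
    base_model=nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d)),
    score_net=nn.Linear(d, 1),
)
print(model(torch.randn(n, d)).shape)  # -> torch.Size([5, 8])
```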
