Adaptive Frequency Filters As Efficient Global Token Mixers

07/26/2023
by   Zhipeng Huang, et al.

Recent vision transformers, large-kernel CNNs and MLPs have attained remarkable success in broad vision tasks thanks to their effective information fusion in the global scope. However, their efficient deployment, especially on mobile devices, still suffers from notable challenges due to the heavy computational costs of self-attention mechanisms, large kernels, or fully connected layers. In this work, we apply the conventional convolution theorem to deep learning to address this, and reveal that adaptive frequency filters can serve as efficient global token mixers. With this insight, we propose the Adaptive Frequency Filtering (AFF) token mixer. This neural operator transfers a latent representation to the frequency domain via a Fourier transform and performs semantic-adaptive frequency filtering via an elementwise multiplication, which is mathematically equivalent to a token mixing operation in the original latent space with a dynamic convolution kernel as large as the spatial resolution of this latent representation. We take AFF token mixers as primary neural operators to build a lightweight neural network, dubbed AFFNet. Extensive experiments demonstrate the effectiveness of our proposed AFF token mixer and show that AFFNet achieves superior accuracy and efficiency trade-offs compared to other lightweight network designs on broad visual tasks, including visual recognition and dense prediction tasks.
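The equivalence claimed in the abstract follows directly from the convolution theorem: an elementwise multiplication in the frequency domain corresponds to a circular convolution in the spatial domain, with a kernel that spans the full spatial resolution. A minimal NumPy sketch (function names and shapes are illustrative, not the paper's implementation) checks this numerically:

```python
import numpy as np

def frequency_filter(x, freq_mask):
    # Frequency-domain filtering as in the AFF idea:
    # FFT -> elementwise multiplication -> inverse FFT.
    return np.fft.ifft2(np.fft.fft2(x) * freq_mask).real

def circular_conv2d(x, kernel):
    # Direct spatial-domain circular convolution with a kernel
    # as large as the input itself (a "global" token mixer).
    H, W = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            for u in range(H):
                for v in range(W):
                    out[i, j] += kernel[u, v] * x[(i - u) % H, (j - v) % W]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))       # toy latent representation
kernel = rng.standard_normal((8, 8))  # full-resolution kernel

# The frequency mask corresponding to this kernel is its Fourier transform;
# in AFF the mask is instead predicted adaptively from the input.
mask = np.fft.fft2(kernel)

print(np.allclose(frequency_filter(x, mask), circular_conv2d(x, kernel)))  # True
```

The frequency-domain path costs O(HW log HW) via the FFT, versus O(H²W²) for the explicit global convolution, which is the efficiency argument behind using adaptive frequency filters as global token mixers.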

Related research

- ActiveMLP: An MLP-like Architecture with Active Token Mixer (03/11/2022)
- Global Filter Networks for Image Classification (07/01/2021)
- SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation (08/22/2023)
- Multiscale Attention via Wavelet Neural Operators for Vision Transformers (03/22/2023)
- Token Pooling in Vision Transformers (10/08/2021)
- ParCNetV2: Oversized Kernel with Enhanced Attention (11/14/2022)
- UniNeXt: Exploring A Unified Architecture for Vision Recognition (04/26/2023)
