DynaMixer: A Vision MLP Architecture with Dynamic Mixing

01/28/2022
by   Ziyu Wang, et al.

Recently, MLP-like vision models have achieved promising performance on mainstream visual recognition tasks. In contrast with vision transformers and CNNs, the success of MLP-like models shows that simple information fusion operations among tokens and channels can yield strong representational power for deep recognition models. However, existing MLP-like models fuse tokens through static fusion operations, which cannot adapt to the contents of the tokens being mixed. Thus, customary information fusion procedures are not effective enough. To this end, this paper presents an efficient MLP-like network architecture, dubbed DynaMixer, which resorts to dynamic information fusion. Critically, we propose a procedure, on which the DynaMixer model relies, to dynamically generate mixing matrices by leveraging the contents of all the tokens to be mixed. To reduce the time complexity and improve the robustness, a dimensionality reduction technique and a multi-segment fusion mechanism are adopted. Our proposed DynaMixer model (97M parameters) achieves 84.3% top-1 accuracy on the ImageNet-1K dataset without extra training data, performing favorably against the state-of-the-art vision MLP models. When the number of parameters is reduced to 26M, it still achieves 82.7% top-1 accuracy, surpassing the existing MLP-like models with a similar capacity. The implementation of DynaMixer will be made available to the public.
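The abstract names three ingredients: a mixing matrix generated from the contents of the tokens, a dimensionality reduction step, and multi-segment fusion. The sketch below is one possible reading of how they fit together, not the authors' implementation. All function names, shapes, the row-wise softmax normalization, and the way channels are split into segments are assumptions made for illustration; the abstract does not specify them. It uses only the Python standard library so it runs as written.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical stand-in for learned weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def dynamic_mixing_matrix(tokens, w_reduce, w_generate):
    """Build an N x N mixing matrix from the contents of all N tokens.

    tokens:     N x D token features
    w_reduce:   D x d projection with d << D (assumed form of the
                dimensionality reduction)
    w_generate: (N*d) x (N*N) projection that emits the mixing logits
    """
    n = len(tokens)
    reduced = matmul(tokens, w_reduce)            # N x d
    flat = [v for row in reduced for v in row]    # length N*d
    logits = matmul([flat], w_generate)[0]        # length N*N
    # Assumption: each row is normalized with a softmax.
    return [softmax(logits[i * n:(i + 1) * n]) for i in range(n)]


def dyna_mix(tokens, segment_weights, w_fuse):
    """Mix tokens with one dynamic matrix per channel segment, then fuse."""
    n, dim = len(tokens), len(tokens[0])
    seg_dim = dim // len(segment_weights)
    outputs = []
    for s, (w_reduce, w_generate) in enumerate(segment_weights):
        mixing = dynamic_mixing_matrix(tokens, w_reduce, w_generate)
        segment = [row[s * seg_dim:(s + 1) * seg_dim] for row in tokens]
        outputs.append(matmul(mixing, segment))   # N x seg_dim
    # Concatenate the per-segment results back to N x D, then fuse channels.
    merged = [sum((out[i] for out in outputs), []) for i in range(n)]
    return matmul(merged, w_fuse)


rng = random.Random(0)
N, D, d, S = 4, 8, 2, 2  # tokens, channels, reduced channels, segments
tokens = random_matrix(N, D, rng, scale=1.0)
segment_weights = [
    (random_matrix(D, d, rng), random_matrix(N * d, N * N, rng)) for _ in range(S)
]
w_fuse = random_matrix(D, D, rng)

mixing = dynamic_mixing_matrix(tokens, *segment_weights[0])
mixed = dyna_mix(tokens, segment_weights, w_fuse)
```

Under these assumptions, the mixing weights depend on the input tokens, unlike a static token-mixing MLP whose weights are fixed after training. Reducing D to d shrinks the generator's input from N*D to N*d values, and each segment gets its own mixing matrix.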


research
07/21/2022

SplitMixer: Fat Trimmed From MLP-like Models

We present SplitMixer, a simple and lightweight isotropic MLP-like archi...
research
08/30/2021

Hire-MLP: Vision MLP via Hierarchical Rearrangement

This paper presents Hire-MLP, a simple yet competitive vision MLP archit...
research
05/17/2023

CageViT: Convolutional Activation Guided Efficient Vision Transformer

Recently, Transformers have emerged as the go-to architecture for both v...
research
03/11/2022

ActiveMLP: An MLP-like Architecture with Active Token Mixer

This paper presents ActiveMLP, a general MLP-like backbone for computer ...
research
06/13/2022

MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing

Convolutional Neural Networks (CNNs) have been regarded as the go-to mod...
research
05/27/2022

MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning

In this study, we propose Mixed and Masked Image Modeling (MixMIM), a si...
research
10/15/2022

MIXER: Multiattribute, Multiway Fusion of Uncertain Pairwise Affinities

We present a multiway fusion algorithm capable of directly processing un...
