iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer

04/25/2023
by Toshihiro Ota, et al.

In the last few years, the success of Transformers in computer vision has stimulated the discovery of many alternative models that compete with Transformers, such as the MLP-Mixer. Despite their weak inductive bias, these models have achieved performance comparable to well-studied convolutional neural networks. Recent studies on modern Hopfield networks suggest a correspondence between certain energy-based associative memory models and Transformers or MLP-Mixer, and shed light on the theoretical background of Transformer-type architecture design. In this paper we generalize the correspondence to the recently introduced hierarchical Hopfield network, and find iMixer, a novel generalization of the MLP-Mixer model. Unlike ordinary feedforward neural networks, iMixer involves MLP layers that propagate forward from the output side to the input side. We characterize the module as an example of an invertible, implicit, and iterative mixing module. We evaluate the model's performance on image classification tasks with various datasets, and find that iMixer achieves reasonable improvements over the baseline vanilla MLP-Mixer. The results imply that the correspondence between Hopfield networks and Mixer models serves as a principle for understanding a broader class of Transformer-like architecture designs.
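The phrase "MLP layers that propagate forward from the output side to the input side" describes an implicit layer: the network's forward pass evaluates the inverse of an MLP rather than the MLP itself. The PyTorch sketch below is a rough illustration only, not the authors' code. It assumes the mixing layer inverts a residual token-mixing MLP z ↦ z + f(z) by fixed-point iteration, in the style of i-ResNets; the class name ImplicitTokenMixer, the shapes, and the iteration count are all hypothetical.

```python
# Minimal sketch (NOT the authors' implementation) of an invertible,
# implicit, iterative token-mixing layer. Assumption: the forward pass
# computes the inverse of a residual MLP, z = (id + f)^(-1)(x),
# approximated by the fixed-point iteration z_{k+1} = x - f(z_k).
import torch
import torch.nn as nn


class ImplicitTokenMixer(nn.Module):
    """Hypothetical module name; hyperparameters are illustrative."""

    def __init__(self, num_tokens: int, hidden_dim: int, num_iters: int = 5):
        super().__init__()
        # f mixes information across the token axis, as in MLP-Mixer.
        self.f = nn.Sequential(
            nn.Linear(num_tokens, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, num_tokens),
        )
        self.num_iters = num_iters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, tokens); token axis last so Linear mixes tokens.
        # Approximately solve z + f(z) = x for z. The iteration converges
        # when f is contractive (e.g., via spectrally normalized weights,
        # omitted here for brevity).
        z = x
        for _ in range(self.num_iters):
            z = x - self.f(z)
        return z


# Usage: mix 196 tokens of a 512-channel feature map.
x = torch.randn(8, 512, 196)
layer = ImplicitTokenMixer(num_tokens=196, hidden_dim=256)
out = layer(x)  # same shape as x: (8, 512, 196)
```

Under this reading, the three adjectives in the title line up naturally: the layer is iterative because the forward pass runs a fixed-point loop, invertible because it computes an inverse map by construction, and implicit because the output z is defined by an equation rather than an explicit formula.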


Related research

03/26/2021 · Understanding Robustness of Transformers for Image Classification
Deep Convolutional Neural Networks (CNNs) have long been the architectur...

11/22/2021 · MetaFormer is Actually What You Need for Vision
Transformers have shown great potential in computer vision tasks. A comm...

05/04/2021 · MLP-Mixer: An all-MLP Architecture for Vision
Convolutional Neural Networks (CNNs) are the go-to model for computer vi...

11/22/2021 · DBIA: Data-free Backdoor Injection Attack against Transformer Networks
Recently, transformer architecture has demonstrated its significance in ...

10/25/2021 · Gophormer: Ego-Graph Transformer for Node Classification
Transformers have achieved remarkable performance in a myriad of fields ...

02/14/2023 · Energy Transformer
Transformers have become the de facto models of choice in machine learni...

10/06/2022 · The Lie Derivative for Measuring Learned Equivariance
Equivariance guarantees that a model's predictions capture key symmetrie...
