Reversible Column Networks

12/22/2022
by Yuxuan Cai et al.

We propose a new neural network design paradigm, the Reversible Column Network (RevCol). The main body of RevCol is composed of multiple copies of a subnetwork, named columns, between which multi-level reversible connections are employed. This architectural scheme gives RevCol behavior very different from that of conventional networks: during forward propagation, features in RevCol are gradually disentangled as they pass through each column, while their total information is maintained rather than compressed or discarded as in other networks. Our experiments show that CNN-style RevCol models achieve very competitive performance on multiple computer vision tasks such as image classification, object detection, and semantic segmentation, especially with a large parameter budget and a large dataset. For example, after ImageNet-22K pre-training, RevCol-XL obtains 88.2% ImageNet-1K accuracy. Given more pre-training data, our largest model, RevCol-H, reaches 90.0% on ImageNet-1K, 63.8% AP-box on the COCO detection minival set, and 61.0% mIoU on ADE20k segmentation. To our knowledge, these are the best COCO detection and ADE20k segmentation results among pure (static) CNN models. Moreover, as a general macro-architecture design, RevCol can also be introduced into transformers or other neural networks, where it is demonstrated to improve performance in both computer vision and NLP tasks. We release code and models at https://github.com/megvii-research/RevCol
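The multi-level reversible connection is the core mechanism here, so a small sketch may help. Below is a minimal PyTorch illustration of a simplified two-input reversible unit in the spirit of RevCol's column-to-column connections; the class name ReversibleUnit, the gamma scaling factor, and the tiny convolutional body are illustrative assumptions and do not reproduce the released implementation, which fuses features from several levels per connection.

```python
import torch
import torch.nn as nn

class ReversibleUnit(nn.Module):
    """Simplified sketch of one reversible connection between columns.

    The current column's same-level feature is computed as
        out = F(lower_feat) + gamma * prev_col_feat,
    which is exactly invertible in prev_col_feat:
        prev_col_feat = (out - F(lower_feat)) / gamma.
    Because the inverse is exact, the previous column's activation
    need not be stored; it can be reconstructed during backprop.
    """

    def __init__(self, dim: int, gamma: float = 1.0):
        super().__init__()
        # F may be any (non-invertible) block; a small conv block for illustration.
        self.f = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.gamma = gamma

    def forward(self, lower_feat: torch.Tensor, prev_col_feat: torch.Tensor) -> torch.Tensor:
        # lower_feat: feature from the level below in the current column
        # prev_col_feat: same-level feature from the previous column
        return self.f(lower_feat) + self.gamma * prev_col_feat

    def inverse(self, out: torch.Tensor, lower_feat: torch.Tensor) -> torch.Tensor:
        # Recover the previous column's feature from the unit's output.
        return (out - self.f(lower_feat)) / self.gamma

# Invertibility check: the previous column's feature is recoverable.
unit = ReversibleUnit(dim=8)
lower = torch.randn(2, 8, 16, 16)
prev = torch.randn(2, 8, 16, 16)
with torch.no_grad():
    out = unit(lower, prev)
    assert torch.allclose(unit.inverse(out, lower), prev, atol=1e-5)
```

Stacking such units across levels and columns yields the maintained-information behavior described above, and the exact inverse gives the usual memory benefit of reversible designs: activations can be reconstructed column by column during the backward pass instead of being cached.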

Related research

- RevColV2: Exploring Disentangled Representations in Masked Image Modeling (09/02/2023)
- FNA++: Fast Network Adaptation via Parameter Remapping and Architecture Search (06/21/2020)
- PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers (11/24/2021)
- Rethinking ImageNet Pre-training (11/21/2018)
- Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information (11/17/2022)
- Multiscale Deep Equilibrium Models (06/15/2020)
