Can CNNs Be More Robust Than Transformers?

06/07/2022
by   Zeyu Wang, et al.

The recent success of Vision Transformers is shaking the decade-long dominance of Convolutional Neural Networks (CNNs) in image recognition. Specifically, in terms of robustness on out-of-distribution samples, recent research finds that Transformers are inherently more robust than CNNs, regardless of the training setup. Moreover, it is believed that this superiority of Transformers should largely be credited to their self-attention-like architectures per se. In this paper, we question that belief by closely examining the design of Transformers. Our findings lead to three highly effective architecture designs for boosting robustness, yet simple enough to be implemented in several lines of code, namely a) patchifying input images, b) enlarging the kernel size, and c) reducing the number of activation and normalization layers. Bringing these components together, we are able to build pure CNN architectures, without any attention-like operations, that are as robust as, or even more robust than, Transformers. We hope this work can help the community better understand the design of robust neural architectures. The code is publicly available at https://github.com/UCSC-VLAA/RobustCNN.
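The three designs can be sketched in a few lines of PyTorch. This is a hypothetical illustration, not the authors' released code (see the linked repository for that); the channel width, patch size, and kernel size below are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn as nn

class PatchifyStem(nn.Module):
    """a) Patchify the input: a non-overlapping p x p strided convolution,
    analogous to ViT's patch-embedding layer (illustrative patch size)."""
    def __init__(self, in_ch=3, dim=96, patch=8):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):
        return self.proj(x)

class LargeKernelBlock(nn.Module):
    """b) Enlarge the kernel: a depthwise conv with a large kernel;
    c) reduce activation/normalization: one norm and one activation per block."""
    def __init__(self, dim=96, kernel=11):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, kernel_size=kernel,
                            padding=kernel // 2, groups=dim)  # large depthwise kernel
        self.norm = nn.BatchNorm2d(dim)   # single normalization layer in the block
        self.pw1 = nn.Conv2d(dim, 4 * dim, kernel_size=1)
        self.act = nn.ReLU()              # single activation layer in the block
        self.pw2 = nn.Conv2d(4 * dim, dim, kernel_size=1)

    def forward(self, x):
        # residual block: dw-conv -> norm -> expand -> act -> project
        return x + self.pw2(self.act(self.pw1(self.norm(self.dw(x)))))

x = torch.randn(1, 3, 224, 224)
y = LargeKernelBlock()(PatchifyStem()(x))
print(tuple(y.shape))  # patchifying with p=8 gives a 28 x 28 feature map
```

Each component is a drop-in change to a standard CNN block, which is why the paper describes them as implementable in several lines of code.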


Related research

11/10/2021 · Are Transformers More Robust Than CNNs?
Transformer emerges as a powerful tool for visual recognition. In additi...

06/24/2021 · Exploring Corruption Robustness: Inductive Biases in Vision Transformers and MLP-Mixers
Recently, vision transformers and MLP-based models have been developed i...

03/19/2022 · CNNs and Transformers Perceive Hybrid Images Similar to Humans
Hybrid images is a technique to generate images with two interpretations...

10/11/2022 · Curved Representation Space of Vision Transformers
Neural networks with self-attention (a.k.a. Transformers) like ViT and S...

10/20/2022 · How Does a Deep Learning Model Architecture Impact Its Privacy?
As a booming research area in the past decade, deep learning technologie...

10/06/2022 · The Lie Derivative for Measuring Learned Equivariance
Equivariance guarantees that a model's predictions capture key symmetrie...

07/21/2022 · SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
Recent isotropic networks, such as ConvMixer and vision transformers, ha...
