Deeper Insights into ViTs Robustness towards Common Corruptions

04/26/2022
by Rui Tian, et al.

Recent literature has shown that design strategies from Convolutional Neural Networks (CNNs) benefit Vision Transformers (ViTs) in various vision tasks. However, it remains unclear how these design choices affect robustness when transferred to ViTs. In this paper, we make the first attempt to investigate how CNN-like architectural designs and CNN-based data augmentation strategies affect ViTs' robustness to common corruptions through extensive and rigorous benchmarking. We demonstrate that overlapping patch embedding and a convolutional Feed-Forward Network (FFN) improve robustness. Furthermore, adversarial noise training is effective on ViTs, while Fourier-domain augmentation fails. Moreover, we introduce a novel conditional method that enables input-varied augmentations from two angles: (1) generating dynamic augmentation parameters conditioned on input images, which leads to state-of-the-art robustness through conditional convolutions; and (2) selecting the most suitable augmentation strategy with an extra predictor, which helps achieve the best trade-off between clean accuracy and robustness.
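To make the term "overlapping patch embedding" concrete, the following is a minimal sketch of the patch-count arithmetic, not the authors' implementation. It assumes the patch embedding behaves like a strided convolution, so patches overlap whenever the kernel is larger than the stride. The function names `patches_per_side` and `patch_spans` and all kernel, stride, and padding values are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def patches_per_side(image_size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Patch positions along one side, using the standard convolution output-size formula."""
    return (image_size + 2 * padding - kernel) // stride + 1


def patch_spans(length: int, kernel: int, stride: int) -> List[Tuple[int, int]]:
    """Half-open [start, end) pixel ranges covered by each patch along one axis (no padding)."""
    count = patches_per_side(length, kernel, stride)
    return [(i * stride, i * stride + kernel) for i in range(count)]


# Assumed, illustrative settings (not taken from the paper).
non_overlapping = patches_per_side(224, kernel=16, stride=16)        # kernel == stride
overlapping = patches_per_side(224, kernel=7, stride=4, padding=3)   # kernel > stride

print(non_overlapping ** 2, overlapping ** 2)   # 196 3136
print(patch_spans(8, kernel=4, stride=4))       # [(0, 4), (4, 8)]
print(patch_spans(8, kernel=4, stride=2))       # [(0, 4), (2, 6), (4, 8)]
```

In the last line, neighbouring patches share two pixels, which is the sense in which the embedding overlaps. With kernel equal to stride, as in the second-to-last line, each pixel belongs to exactly one patch.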


research
11/16/2021

Improved Robustness of Vision Transformer via PreLayerNorm in Patch Embedding

Vision transformers (ViTs) have recently demonstrated state-of-the-art p...
research
12/02/2020

A Self-Supervised Feature Map Augmentation (FMA) Loss and Combined Augmentations Finetuning to Efficiently Improve the Robustness of CNNs

Deep neural networks are often not robust to semantically-irrelevant cha...
research
06/06/2019

Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation

Deploying machine learning systems in the real world requires both high ...
research
06/21/2019

A Fourier Perspective on Model Robustness in Computer Vision

Achieving robustness to distributional shift is a longstanding and chall...
research
10/15/2021

Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation

We investigate the robustness of vision transformers (ViTs) through the ...
research
10/14/2022

Optimizing Vision Transformers for Medical Image Segmentation and Few-Shot Domain Adaptation

The adaptation of transformers to computer vision is not straightforward...
research
12/27/2021

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions

Despite their impressive performance on image classification tasks, deep...
