Pyramid Adversarial Training Improves ViT Performance

11/30/2021
by Charles Herrmann, et al.

Aggressive data augmentation is a key component of the strong generalization capabilities of the Vision Transformer (ViT). One such data augmentation technique is adversarial training; however, many prior works have shown that it often results in poor clean accuracy. In this work, we present Pyramid Adversarial Training, a simple and effective technique to improve ViT's overall performance. We pair it with "matched" Dropout and stochastic depth regularization, which adopts the same Dropout and stochastic depth configuration for the clean and adversarial samples. Similar to the improvements on CNNs by AdvProp (not directly applicable to ViT), our Pyramid Adversarial Training breaks the trade-off between in-distribution accuracy and out-of-distribution robustness for ViT and related architectures. It leads to a 1.82% absolute improvement in ImageNet clean accuracy for the ViT-B model when trained only on ImageNet-1K data, while simultaneously boosting performance on 7 ImageNet robustness metrics by absolute amounts ranging from 1.76% to 11.45%. We set a new state of the art for ImageNet-C (41.4 mCE), ImageNet-R (53.92%), and ImageNet-Sketch (41.04%) without extra data, using only the ViT-B/16 backbone and our Pyramid Adversarial Training. Our code will be publicly available upon acceptance.
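
The "matched" Dropout idea can be illustrated with a small sketch. This is not the authors' code: it assumes one possible reading of "same configuration", namely that a single sampled dropout mask is reused for the clean and the adversarial version of a sample. The helper names (`sample_dropout_mask`, `apply_mask`) and the feature values are hypothetical and exist only for illustration.

```python
import random


def sample_dropout_mask(n, rate, rng):
    """Inverted-dropout mask: 0.0 for dropped units, 1/(1-rate) for kept units."""
    keep = 1.0 - rate
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(n)]


def apply_mask(features, mask):
    """Element-wise product of a feature vector with a dropout mask."""
    return [f * m for f, m in zip(features, mask)]


rng = random.Random(0)

# Hypothetical intermediate features for one clean sample and its
# adversarially perturbed counterpart (illustrative values only).
clean_features = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1]
adv_features = [0.6, -1.0, 0.2, 2.1, -0.9, 1.0]

# Assumed reading of "matched": sample the mask once, then reuse it,
# so both branches drop exactly the same units.
mask = sample_dropout_mask(len(clean_features), rate=0.5, rng=rng)
clean_out = apply_mask(clean_features, mask)
adv_out = apply_mask(adv_features, mask)
```

An unmatched variant would call `sample_dropout_mask` separately for each branch, so the clean and adversarial samples would generally see different dropped units. The same reuse pattern would apply to stochastic depth, where the sampled quantity is which layers to skip rather than which units to drop.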
