Analysis and Applications of Class-wise Robustness in Adversarial Training

05/29/2021
by   Qi Tian, et al.

Adversarial training is one of the most effective approaches to improving model robustness against adversarial examples. However, previous works mainly focus on the overall robustness of the model, and an in-depth analysis of the role each class plays in adversarial training is still missing. In this paper, we propose to analyze class-wise robustness in adversarial training. First, we provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet. Surprisingly, we find remarkable robustness discrepancies among classes, leading to unbalanced/unfair class-wise robustness in the robust models. Furthermore, we investigate the relations between classes and find that the unbalanced class-wise robustness is highly consistent across different attack and defense methods. Moreover, we observe that stronger attack methods in adversarial learning achieve their performance improvement mainly through more successful attacks on the vulnerable classes (i.e., classes with less robustness). Inspired by these findings, we design a simple but effective attack method based on the traditional PGD attack, named the Temperature-PGD attack, which enlarges the robustness disparity among classes by applying a temperature factor to the confidence distribution of each image. Experiments demonstrate that our method achieves a higher attack success rate than the PGD attack. Furthermore, from the defense perspective, we also make modifications to the training and inference phases to improve the robustness of the most vulnerable class, so as to mitigate the large differences in class-wise robustness. We believe our work can contribute to a more comprehensive understanding of adversarial training as well as a rethinking of class-wise properties in robust models.
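To make the temperature idea concrete, the sketch below shows an L_∞-bounded PGD loop in which the logits are divided by a temperature T before the softmax, reshaping the confidence distribution that drives the attack gradient. This is a minimal NumPy illustration on a toy linear classifier, not the authors' implementation; the model, the choice of T, and all step-size values are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def temperature_pgd(x, y, W, b, eps=0.1, alpha=0.02, steps=10, T=0.5):
    """L_inf PGD on a linear classifier, with temperature-scaled softmax.

    x : input vector in [0, 1]^d
    y : true class index (the attack ascends its cross-entropy loss)
    W, b : weights/bias of a toy linear model, logits = W @ x + b
    T : temperature; T < 1 sharpens the confidence distribution
    """
    x_adv = x.copy()
    for _ in range(steps):
        logits = W @ x_adv + b
        p = softmax(logits / T)
        onehot = np.zeros_like(p)
        onehot[y] = 1.0
        # Gradient of CE(softmax(logits / T), y) w.r.t. x for a linear model.
        grad = (W.T @ (p - onehot)) / T
        # Ascend the loss, then project back into the eps-ball and pixel range.
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

For a real network the analytic gradient above would be replaced by automatic differentiation; only the division of the logits by T changes relative to standard PGD.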


