Attacks Which Do Not Kill Training Make Adversarial Learning Stronger

02/26/2020
by Jingfeng Zhang, et al.

Adversarial training based on the minimax formulation is necessary for obtaining adversarial robustness of trained models. However, it is conservative or even pessimistic, so it sometimes hurts natural generalization. In this paper, we raise a fundamental question: do we have to trade off natural generalization for adversarial robustness? We argue that adversarial training should employ confident adversarial data to update the current model. We propose a novel approach of friendly adversarial training (FAT): rather than employing the most adversarial data, which maximize the loss, we search for the least adversarial (i.e., friendly adversarial) data, which minimize the loss, among the adversarial data that are confidently misclassified. Our novel formulation is easy to implement: simply stop search algorithms for the most adversarial data, such as PGD (projected gradient descent), early; we call this early-stopped PGD. Theoretically, FAT is justified by an upper bound on the adversarial risk. Empirically, early-stopped PGD allows us to answer the earlier question negatively: adversarial robustness can indeed be achieved without compromising natural generalization.
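For context, the minimax objective the abstract refers to, and the friendly variant that replaces its inner maximization, can be written schematically as below. The notation (loss ℓ, ε-ball B_ε[x_i], margin ρ) is our rendering for illustration, not the paper's verbatim statement.

```latex
% Standard adversarial training: inner maximization over the eps-ball.
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n}
  \max_{\tilde{x}_i \in B_\epsilon[x_i]} \ell\big(f_\theta(\tilde{x}_i), y_i\big)

% FAT (schematic): minimize the loss instead, over adversarial data
% that remain confidently misclassified (margin rho > 0).
\tilde{x}_i = \arg\min_{\tilde{x} \in B_\epsilon[x_i]}
  \ell\big(f_\theta(\tilde{x}), y_i\big)
\quad \text{s.t.} \quad
  \ell\big(f_\theta(\tilde{x}), y_i\big)
  - \min_{y \in \mathcal{Y}} \ell\big(f_\theta(\tilde{x}), y\big) \ge \rho .
```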
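Early-stopped PGD itself is concrete enough to sketch in code. The following PyTorch-style sketch is our illustration under stated assumptions, not the authors' released implementation: the function name early_stopped_pgd and the default hyperparameters (eps, alpha, max_steps, tau) are hypothetical, with tau denoting the number of extra PGD steps an example is allowed after its iterate first becomes misclassified.

```python
import torch
import torch.nn.functional as F

def early_stopped_pgd(model, x, y, eps=8/255, alpha=2/255, max_steps=10, tau=1):
    """Search for friendly adversarial data: run PGD, but once an example's
    iterate is misclassified, allow only `tau` further steps for that example
    instead of maximizing the loss for all `max_steps` iterations."""
    # Random start inside the epsilon-ball, clipped to the valid pixel range.
    x_adv = x.clone().detach()
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)

    # Per-example budget of extra steps after the first misclassification.
    budget = torch.full((x.size(0),), tau, device=x.device)

    for _ in range(max_steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        loss = F.cross_entropy(logits, y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Decrement the budget of every currently misclassified example.
            misclassified = logits.argmax(dim=1) != y
            budget = budget - misclassified.long()
            active = budget >= 0  # examples still being perturbed
            if not active.any():
                break
            # Signed-gradient ascent step; freeze examples whose budget ran out.
            step = alpha * grad.sign()
            step[~active] = 0.0
            x_adv = x_adv.detach() + step
            # Project back onto the eps-ball around x, then the pixel range.
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
            x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()
```

With tau = 0 the search stops at each example's first misclassified iterate; as tau grows, the behavior approaches standard PGD, which always runs the full max_steps iterations.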


Related research

11/14/2022
Efficient Adversarial Training with Robust Early-Bird Tickets
Adversarial training is one of the most powerful methods to improve the ...

06/07/2022
Adaptive Regularization for Adversarial Training
Adversarial training, which is to enhance robustness against adversarial...

04/19/2022
Poisons that are learned faster are more effective
Imperceptible poisoning attacks on entire datasets have recently been to...

09/10/2020
Quantifying the Preferential Direction of the Model Gradient in Adversarial Training With Projected Gradient Descent
Adversarial training, especially projected gradient descent (PGD), has b...

04/28/2022
Improving robustness of language models from a geometry-aware perspective
Recent studies have found that removing the norm-bounded projection and ...

05/31/2021
NoiLIn: Do Noisy Labels Always Hurt Adversarial Training?
Adversarial training (AT) based on minimax optimization is a popular lea...

06/13/2023
On Achieving Optimal Adversarial Test Error
We first elucidate various fundamental properties of optimal adversarial...
