Local Reweighting for Adversarial Training

06/30/2021
by Ruize Gao, et al.

Instance-reweighted adversarial training (IRAT) can significantly boost the robustness of trained models: data that are less/more vulnerable to the given attack are assigned smaller/larger weights during training. However, when tested on attacks different from the one simulated in training, the robustness may drop significantly (e.g., even below that of no reweighting). In this paper, we study this problem and propose our solution: locally reweighted adversarial training (LRAT). The rationale behind IRAT is that we need not pay much attention to an instance that is already safe under the given attack. We argue that this safeness should be attack-dependent, so that for the same instance, its weight can change given different attacks based on the same model. Thus, if the attack simulated in training is mis-specified, the weights of IRAT are misleading. To this end, LRAT pairs each instance with its adversarial variants and performs local reweighting inside each pair, while performing no global reweighting. The rationale is to fit the instance itself if it is immune to the attack, but not to skip the pair, in order to passively defend against different attacks in the future. Experiments show that LRAT works better than both IRAT (i.e., global reweighting) and standard AT (i.e., no reweighting) when trained with one attack and tested on different attacks.
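Since the abstract only sketches the procedure, here is a minimal PyTorch-style illustration of the local-reweighting idea. It assumes PGD as the training attack and a softmax over per-pair losses as the weighting scheme; both choices, and all names (pgd_attack, lrat_loss, temperature), are illustrative assumptions rather than the authors' exact formulation.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        # Standard L-inf PGD; stands in for "the attack simulated in training".
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv.detach()

    def lrat_loss(model, x, y, temperature=1.0):
        # Pair each instance with its adversarial variant and reweight only
        # inside the pair; each pair keeps a total weight of 1, so no
        # instance is globally down-weighted (unlike IRAT).
        x_adv = pgd_attack(model, x, y)
        loss_nat = F.cross_entropy(model(x), y, reduction="none")      # shape (B,)
        loss_adv = F.cross_entropy(model(x_adv), y, reduction="none")  # shape (B,)
        # Assumed local weighting: a softmax over the two losses in each
        # pair, so that a pair whose adversarial variant is already easy
        # (the instance is immune to the attack) shifts its weight toward
        # fitting the clean instance instead of skipping the pair.
        pair = torch.stack([loss_nat, loss_adv], dim=1).detach()       # (B, 2)
        w = F.softmax(pair / temperature, dim=1)                       # rows sum to 1
        return (w[:, 0] * loss_nat + w[:, 1] * loss_adv).mean()

Because the weights in each pair sum to 1, every instance still contributes one unit of loss; local reweighting only decides how that unit is split between the clean example and its adversarial variant, which is how LRAT avoids skipping instances that happen to be immune to the simulated attack.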

Related research

10/05/2020 · Geometry-aware Instance-reweighted Adversarial Training
In adversarial machine learning, there was a common belief that robustne...

06/15/2021 · Probabilistic Margins for Instance Reweighting in Adversarial Training
Reweighting adversarial data during training has been recently shown to ...

06/19/2019 · Global Adversarial Attacks for Assessing Deep Learning Robustness
It has been shown that deep neural networks (DNNs) may be vulnerable to ...

03/02/2021 · Evaluating the Robustness of Geometry-Aware Instance-Reweighted Adversarial Training
In this technical report, we evaluate the adversarial robustness of a ve...

06/02/2022 · Robustness Evaluation and Adversarial Training of an Instance Segmentation Model
To evaluate the robustness of non-classifier models, we propose probabil...

06/18/2022 · DECK: Model Hardening for Defending Pervasive Backdoors
Pervasive backdoors are triggered by dynamic and pervasive input perturb...

04/09/2020 · Rethinking the Trigger of Backdoor Attack
In this work, we study the problem of backdoor attacks, which add a spec...
