Can Adversarial Training Be Manipulated By Non-Robust Features?

01/31/2022
by   Lue Tao, et al.

Adversarial training, originally designed to resist test-time adversarial examples, has shown promise in mitigating training-time availability attacks. This defense ability, however, is challenged in this paper. We identify a novel threat model named stability attacks, which aim to hinder robust availability by slightly perturbing the training data. Under this threat, we find that adversarial training using a conventional defense budget ϵ provably fails to provide test robustness in a simple statistical setting when the non-robust features of the training data are reinforced by ϵ-bounded perturbations. Further, we analyze the necessity of enlarging the defense budget to counter stability attacks. Finally, comprehensive experiments demonstrate that stability attacks are harmful on benchmark datasets, and thus adaptive defenses are necessary to maintain robustness.
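The core mechanism can be illustrated with a toy sketch: a stability-style perturbation moves each training point within an ϵ-bounded L∞ ball so that a reference linear model fits it *more* easily, reinforcing the model's existing (non-robust) features rather than attacking them. This is a minimal illustration under simplifying assumptions (logistic regression, a fixed reference weight vector `w`, sign-gradient descent on the loss); the function names and the exact procedure are illustrative, not the paper's construction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stability_perturb(X, y, w, eps, steps=10, lr=0.1):
    """Illustrative stability-style attack (not the paper's exact method):
    craft L-inf eps-bounded perturbations that *decrease* the logistic
    loss of a reference model w on each training point, i.e. reinforce
    the features the model already relies on."""
    delta = np.zeros_like(X)
    for _ in range(steps):
        margins = (X + delta) @ w * y                   # signed margins y * w.x
        grad = -(y * sigmoid(-margins))[:, None] * w    # d(logistic loss)/d(input)
        delta -= lr * np.sign(grad)                     # descend (not ascend) the loss
        delta = np.clip(delta, -eps, eps)               # stay inside the eps ball
    return X + delta

def logistic_loss(X, y, w):
    return np.mean(np.log1p(np.exp(-(X @ w) * y)))
```

Because the perturbation descends the loss instead of ascending it, the poisoned data looks *easier* to fit, which is precisely why a defender training adversarially with the same budget ϵ can be misled.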


Related research

02/26/2021 · What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors
Data poisoning is a threat model in which a malicious actor tampers with...

02/09/2021 · Provable Defense Against Delusive Poisoning
Delusive poisoning is a special kind of attack to obstruct learning, whe...

01/31/2023 · Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression
Perturbative availability poisoning (PAP) adds small changes to images t...

06/16/2022 · Analysis and Extensions of Adversarial Training for Video Classification
Adversarial training (AT) is a simple yet effective defense against adve...

04/25/2023 · Learning Robust Deep Equilibrium Models
Deep equilibrium (DEQ) models have emerged as a promising class of impli...

06/17/2021 · Adversarial Visual Robustness by Causal Intervention
Adversarial training is the de facto most promising defense against adve...

06/12/2023 · How robust accuracy suffers from certified training with convex relaxations
Adversarial attacks pose significant threats to deploying state-of-the-a...
