Semantics-Preserving Adversarial Training

09/23/2020
by Wonseok Lee, et al.

Adversarial training is a defense technique that improves the adversarial robustness of a deep neural network (DNN) by including adversarial examples in the training data. In this paper, we identify an overlooked problem with adversarial training: these adversarial examples often have different semantics than the original data, introducing unintended biases into the model. We hypothesize that such non-semantics-preserving (and consequently ambiguous) adversarial data harm the robustness of the target models. To mitigate such unintended semantic changes in adversarial examples, we propose semantics-preserving adversarial training (SPAT), which encourages perturbation of the pixels that are shared among all classes when generating adversarial examples in the training stage. Experiment results show that SPAT improves adversarial robustness and achieves state-of-the-art results on CIFAR-10 and CIFAR-100.
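
To make the idea concrete, the sketch below illustrates one way masked adversarial-example generation could be wired into an adversarial-training loop. It is only an illustrative sketch, not the paper's implementation: it assumes PyTorch, an L_inf PGD-style inner attack, and a precomputed `mask` tensor marking the pixels shared among all classes. The names `masked_pgd` and `spat_train_step`, and how the shared-pixel mask is obtained, are hypothetical placeholders; the paper describes the actual procedure.

    # Illustrative sketch only, assuming PyTorch and a PGD-style inner attack.
    # `mask` (same shape as x, or broadcastable, 1 = class-shared pixel) is a
    # hypothetical input; the paper, not this sketch, defines how it is computed.
    import torch
    import torch.nn.functional as F

    def masked_pgd(model, x, y, mask, eps=8/255, alpha=2/255, steps=7):
        """Craft adversarial examples, perturbing only the masked (class-shared) pixels."""
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            loss.backward()
            with torch.no_grad():
                # Gradient-sign step, restricted to the shared-pixel region.
                delta += alpha * delta.grad.sign() * mask
                # Project into the L_inf ball and the valid image range,
                # keeping non-shared pixels unperturbed.
                delta.clamp_(-eps, eps)
                delta.mul_(mask)
                delta.copy_((x + delta).clamp(0, 1) - x)
            delta.grad.zero_()
        return (x + delta).detach()

    def spat_train_step(model, optimizer, x, y, mask):
        """One adversarial-training step on masked adversarial examples."""
        model.train()
        x_adv = masked_pgd(model, x, y, mask)
        optimizer.zero_grad()  # also clears parameter grads accumulated by the attack
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()

The only difference from standard PGD-based adversarial training is that the perturbation update and projection are multiplied by the shared-pixel mask, so pixels that distinguish classes are left untouched and the example's semantics are more likely to be preserved.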

Related research

05/15/2019 · On Norm-Agnostic Robustness of Adversarial Training
Adversarial examples are carefully perturbed inputs for fooling machine...

02/16/2023 · Masking and Mixing Adversarial Training
While convolutional neural networks (CNNs) have achieved excellent perfo...

05/01/2019 · Dropping Pixels for Adversarial Robustness
Deep neural networks are vulnerable against adversarial examples. In thi...

05/13/2018 · Curriculum Adversarial Training
Recently, deep learning has been applied to many security-sensitive appl...

06/01/2023 · Constructing Semantics-Aware Adversarial Examples with Probabilistic Perspective
In this study, we introduce a novel, probabilistic viewpoint on adversar...

05/16/2023 · Releasing Inequality Phenomena in L_∞-Adversarial Training via Input Gradient Distillation
Since adversarial examples appeared and showed the catastrophic degradat...

09/06/2018 · Adversarial Over-Sensitivity and Over-Stability Strategies for Dialogue Models
We present two categories of model-agnostic adversarial strategies that ...
