Releasing Inequality Phenomena in L_∞-Adversarial Training via Input Gradient Distillation

05/16/2023
by Junxi Chen, et al.

Since adversarial examples first appeared and demonstrated the catastrophic degradation they can cause in deep neural networks (DNNs), many adversarial defense methods have been devised, among which adversarial training is considered the most effective. However, a recent work revealed inequality phenomena in l_∞-adversarial training: an l_∞-adversarially trained model is vulnerable when a few important pixels are perturbed by i.i.d. noise or occluded. In this paper, we propose a simple yet effective method called Input Gradient Distillation (IGD) to release the inequality phenomena in l_∞-adversarial training. Experiments show that, while preserving the model's adversarial robustness, IGD decreases the l_∞-adversarially trained model's error rate under inductive noise and inductive occlusion by up to 60% and 16.53%, respectively, and on noisy images in ImageNet-C by up to 21.11%, compared to PGDAT. Moreover, we formally explain why equalizing the model's saliency map improves such robustness.
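
The abstract does not spell out the training objective. As a rough, hypothetical illustration only, the PyTorch-style sketch below shows one way an input-gradient-distillation term could be combined with adversarial training: a frozen, standardly trained teacher provides input gradients (saliency maps), and the student is penalized for deviating from them via cosine similarity. The helper names (input_gradient, igd_term), the weight lam, and the exact loss form are assumptions, not the paper's specification.

import torch
import torch.nn.functional as F

def input_gradient(model, x, y, create_graph=False):
    # Gradient of the cross-entropy loss w.r.t. the input pixels,
    # i.e. a simple saliency map.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=create_graph)
    return grad

def igd_term(student, teacher, x, y):
    # Hypothetical distillation penalty: 1 minus the cosine similarity between
    # the student's and the frozen teacher's flattened input gradients.
    g_s = input_gradient(student, x, y, create_graph=True).flatten(1)
    g_t = input_gradient(teacher, x, y).detach().flatten(1)
    return 1.0 - F.cosine_similarity(g_s, g_t, dim=1).mean()

# Assumed usage inside a PGD adversarial-training step (x_adv produced by a PGD attack):
#   loss = F.cross_entropy(student(x_adv), y) + lam * igd_term(student, teacher, x, y)

Whether the distillation is computed on clean or adversarial inputs, and how lam is scheduled, would follow the paper's actual setup rather than this sketch.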

Related research

09/23/2020 · Semantics-Preserving Adversarial Training
Adversarial training is a defense technique that improves adversarial ro...

05/29/2021 · Analysis and Applications of Class-wise Robustness in Adversarial Training
Adversarial training is one of the most effective approaches to improve ...

02/21/2022 · Transferring Adversarial Robustness Through Robust Representation Matching
With the widespread use of machine learning, concerns over its security ...

06/05/2022 · Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training
Adversarial training has been widely explored for mitigating attacks aga...

07/21/2022 · Towards Efficient Adversarial Training on Vision Transformers
Vision Transformer (ViT), as a powerful alternative to Convolutional Neu...

03/19/2021 · Noise Modulation: Let Your Model Interpret Itself
Given the great success of Deep Neural Networks (DNNs) and the black-box ...

05/22/2022 · AutoJoin: Efficient Adversarial Training for Robust Maneuvering via Denoising Autoencoder and Joint Learning
As a result of increasingly adopted machine learning algorithms and ubiq...
