Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses

12/12/2021
by Chun Pong Lau, et al.

Adversarial training (AT) is considered one of the most reliable defenses against adversarial attacks, but models trained with AT sacrifice standard accuracy and generalize poorly to novel attacks. Recent works improve generalization with adversarial samples crafted under novel threat models, such as the on-manifold threat model or the neural perceptual threat model; however, the former requires exact manifold information while the latter requires algorithmic relaxation. Motivated by these considerations, we exploit the underlying manifold information with a Normalizing Flow, ensuring that the exact-manifold assumption holds. Moreover, we propose a novel threat model, the Joint Space Threat Model (JSTM), which can be viewed as a special case of the neural perceptual threat model that requires no additional relaxation to craft the corresponding adversarial attacks. Under JSTM, we develop novel adversarial attacks and defenses. The mixup strategy improves the standard accuracy of neural networks but sacrifices robustness when combined with AT. To tackle this issue, we propose the Robust Mixup strategy, in which we maximize the adversity of the interpolated images, gaining robustness while preventing overfitting. Our experiments show that Interpolated Joint Space Adversarial Training (IJSAT) achieves strong standard accuracy, robustness, and generalization on the CIFAR-10/100, OM-ImageNet, and CIFAR-10-C datasets. IJSAT is also flexible: it can serve as a data-augmentation method to improve standard accuracy, and it can be combined with many existing AT approaches to improve robustness.
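To make the joint-space idea concrete, below is a minimal PGD-style sketch in PyTorch. It assumes a pretrained invertible normalizing flow exposing hypothetical `forward` (image to latent) and `inverse` (latent to image) methods, and it perturbs the on-manifold latent code and an off-manifold image residual jointly. The method names, the budgets `eps_z`/`eps_x`, and the sign-gradient update are illustrative assumptions, not the paper's exact algorithm.

```python
import torch

def joint_space_pgd(model, flow, x, y, eps_z=0.02, eps_x=8/255,
                    steps=10, step_size=0.01):
    """Hypothetical PGD-style attack in the joint (latent, image) space.

    `flow.forward` (image -> latent) and `flow.inverse` (latent -> image)
    are assumed placeholder names for an invertible normalizing flow,
    not the paper's actual API. Images are assumed to lie in [0, 1].
    """
    z0 = flow.forward(x).detach()                         # project x onto the learned manifold
    delta_z = torch.zeros_like(z0, requires_grad=True)    # on-manifold perturbation
    delta_x = torch.zeros_like(x, requires_grad=True)     # off-manifold residual perturbation
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(steps):
        x_adv = flow.inverse(z0 + delta_z) + delta_x      # joint-space adversarial image
        loss = loss_fn(model(x_adv), y)
        grad_z, grad_x = torch.autograd.grad(loss, [delta_z, delta_x])
        with torch.no_grad():
            # ascend the loss, then project each perturbation back into its budget
            delta_z += step_size * grad_z.sign()
            delta_z.clamp_(-eps_z, eps_z)
            delta_x += step_size * grad_x.sign()
            delta_x.clamp_(-eps_x, eps_x)

    return (flow.inverse(z0 + delta_z) + delta_x).detach().clamp(0, 1)
```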

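The Robust Mixup step can likewise be sketched as selecting, among several mixup coefficients, the interpolated batch on which the model's loss is largest, and then training on that most adversarial interpolation. The grid search over `lam_grid` below is our simplification for illustration; the paper maximizes the adversity of the interpolated images directly.

```python
import torch
import torch.nn.functional as F

def robust_mixup_loss(model, x1, y1, x2, y2, lam_grid=(0.3, 0.5, 0.7)):
    """Hypothetical sketch of a Robust Mixup training step.

    Among a few candidate mixup coefficients, keep the interpolated
    batch the model finds hardest (maximal loss) and compute the
    training loss on it. The grid over `lam_grid` is our assumption.
    """
    worst_loss, worst_batch, worst_lam = None, None, None
    with torch.no_grad():
        for lam in lam_grid:
            x_mix = lam * x1 + (1 - lam) * x2             # interpolated images
            loss = (lam * F.cross_entropy(model(x_mix), y1)
                    + (1 - lam) * F.cross_entropy(model(x_mix), y2))
            if worst_loss is None or loss > worst_loss:
                worst_loss, worst_batch, worst_lam = loss, x_mix, lam

    # recompute with gradients enabled and train on the hardest interpolation
    logits = model(worst_batch)
    return (worst_lam * F.cross_entropy(logits, y1)
            + (1 - worst_lam) * F.cross_entropy(logits, y2))
```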