The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training

11/01/2022
by Junhao Dong, et al.

Although current deep learning techniques have yielded superior performance on various computer vision tasks, they remain vulnerable to adversarial examples. Adversarial training and its variants have been shown to be the most effective defenses against adversarial examples. These methods typically regularize the difference between the output probabilities for an adversarial example and its corresponding natural example. However, this regularization can backfire when the model misclassifies the natural example. To circumvent this issue, we propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its “inverse adversarial” counterpart. These samples are generated to maximize the likelihood in the neighborhood of natural examples. Extensive experiments on various vision datasets and architectures demonstrate that our training method achieves state-of-the-art robustness as well as natural accuracy. Furthermore, using a universal version of inverse adversarial examples, we improve the performance of single-step adversarial training techniques at a low computational cost.
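The abstract describes inverse adversarial examples as points near a natural example that maximize the likelihood of the correct label, i.e., the mirror image of a standard PGD attack: gradient *descent* on the loss inside an epsilon-ball instead of ascent. The sketch below illustrates that idea on a toy linear-softmax classifier; the model, step sizes, and helper names are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the logit vector z.
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_grad_x(W, x, y):
    # Gradient of the cross-entropy loss w.r.t. the input x for a
    # linear softmax classifier f(x) = softmax(W @ x): W.T @ (p - e_y).
    p = softmax(W @ x)
    p[y] -= 1.0
    return W.T @ p

def inverse_adversarial(W, x, y, eps=0.1, step=0.02, iters=10):
    """PGD-style *descent*: move x within the L-inf eps-ball around the
    natural example so as to minimize the loss, yielding a high-likelihood
    ("inverse adversarial") neighbor. A standard attack would ascend instead."""
    x_inv = x.copy()
    for _ in range(iters):
        g = ce_grad_x(W, x_inv, y)
        x_inv = x_inv - step * np.sign(g)          # descend the loss
        x_inv = np.clip(x_inv, x - eps, x + eps)   # project back into the ball
    return x_inv
```

During training, the paper's scheme would then regularize the model's outputs on a (standard) adversarial example toward its outputs on this inverse adversarial counterpart, rather than toward the possibly misclassified natural example.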

Related research

- Masking and Mixing Adversarial Training (02/16/2023): While convolutional neural networks (CNNs) have achieved excellent perfo...
- Generating Adversarial Examples with Task Oriented Multi-Objective Optimization (04/26/2023): Deep learning models, even the state-of-the-art ones, are highly vulnera...
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning (10/15/2020): Although deep convolutional neural networks (CNNs) have demonstrated rem...
- On Model Robustness Against Adversarial Examples (11/15/2019): We study the model robustness against adversarial examples, referred to ...
- Vulnerability-Aware Instance Reweighting For Adversarial Training (07/14/2023): Adversarial Training (AT) has been found to substantially improve the ro...
- Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack (06/15/2022): The AutoAttack (AA) has been the most reliable method to evaluate advers...
- Generating Steganographic Images via Adversarial Training (03/01/2017): Adversarial training was recently shown to be competitive against superv...
