Improved Network Robustness with Adversary Critic

10/30/2018
by Alexander Matyasko, et al.

Ideally, what confuses a neural network should also confuse humans. However, recent experiments have shown that small, imperceptible perturbations can change a network's prediction. To address this gap in perception, we propose a novel approach for learning a robust classifier. Our main idea is that adversarial examples for a robust classifier should be indistinguishable from regular data of the adversarial target class. We formulate the problem of learning a robust classifier in the framework of Generative Adversarial Networks (GANs), where the adversarial attack on the classifier acts as the generator and a critic network learns to distinguish between regular and adversarial images. The classifier cost is augmented with the objective that its adversarial examples should confuse the adversary critic. To improve the stability of the adversarial mapping, we introduce an adversarial cycle-consistency constraint, which ensures that applying the adversarial mapping to an adversarial example yields a result close to the original input. In the experiments, we show the effectiveness of our defense: networks trained with our method are more robust than networks trained with adversarial training. Additionally, we verify in experiments with human annotators on MTurk that the adversarial examples are indeed visually confusing. Code for the project is available at https://github.com/aam-at/adversary_critic.
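
To make the structure of the objective concrete, the following is a minimal sketch of how the three terms described above (classification cost, critic-confusion term, and adversarial cycle-consistency term) could be combined for a single example. It is not the authors' implementation. Every name here (`classifier_objective`, `toy_classify`, `toy_attack`, `toy_critic`), the weights `lam` and `mu`, the log-loss form of the critic term, and the L2 distance for the cycle term are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true label."""
    return -math.log(probs[label])


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def classifier_objective(x, y, target, classify, attack, critic, lam=1.0, mu=1.0):
    """Hypothetical per-example classifier cost with critic and cycle terms.

    classify(x)       -> list of class probabilities
    attack(x, t)      -> adversarial example for x aimed at class t
    critic(x, t)      -> probability in (0, 1] that x is regular data of class t
    lam, mu           -> assumed weights for the two extra terms
    """
    x_adv = attack(x, target)    # adversarial example aimed at `target`
    x_cycle = attack(x_adv, y)   # attack back toward the original class

    ce = cross_entropy(classify(x), y)
    # Assumed critic-confusion term: small when the critic believes
    # x_adv is regular data of the target class.
    confuse = -math.log(critic(x_adv, target))
    # Assumed cycle-consistency term: the round trip should return near x.
    cycle = l2_distance(x_cycle, x)
    return ce + lam * confuse + mu * cycle


# Toy stand-ins (assumptions, for illustration only).
def toy_classify(x):
    # Two-class softmax with logits (-x[0], +x[0]).
    p1 = 1.0 / (1.0 + math.exp(-2.0 * x[0]))
    return [1.0 - p1, p1]


def toy_attack(x, target, step=0.5):
    # Shift the first coordinate a fixed step toward the target class.
    direction = 1.0 if target == 1 else -1.0
    return [x[0] + direction * step] + list(x[1:])


def toy_critic(x, target):
    # A maximally confused critic: always undecided.
    return 0.5


loss = classifier_objective([1.0], 1, 0, toy_classify, toy_attack, toy_critic)
print(loss)
```

In this toy setup the attack is exactly reversible, so the cycle term vanishes and the cost reduces to the cross-entropy plus the critic term. In an actual training loop, the critic would be trained in alternation with the classifier, as in other GAN-style setups.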


