Defending Against Adversarial Attacks by Leveraging an Entire GAN

Recent work has shown that state-of-the-art models are highly vulnerable to adversarial perturbations of the input. We propose cowboy, an approach to detecting and defending against adversarial attacks that uses both the discriminator and the generator of a GAN trained on the same dataset. We show that the discriminator consistently scores adversarial samples lower than real samples across multiple attacks and datasets. We provide empirical evidence that adversarial samples lie outside the data manifold learned by the GAN. Based on this observation, we propose a cleaning method that uses both the discriminator and the generator of the GAN to project samples back onto the data manifold. This cleaning procedure is independent of both the classifier and the type of attack, and can therefore be deployed in existing systems.
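The abstract describes two components: detection, via thresholding the discriminator's score, and cleaning, via projecting a sample onto the generator's manifold. The following is a minimal PyTorch sketch of that idea, assuming a trained generator G that maps latent vectors to images and a discriminator D that returns a higher score for more realistic inputs; the latent dimension, step count, learning rate, loss weight lam, and detection threshold are illustrative assumptions, not values from the paper.

```python
import torch


def is_adversarial(x, D, threshold):
    """Flag an input whose discriminator score falls below a threshold
    calibrated on held-out real samples (e.g., a low percentile of
    D's scores on clean data). Threshold choice is an assumption here."""
    return D(x).mean().item() < threshold


def clean_sample(x, G, D, latent_dim=100, steps=200, lr=0.05, lam=0.1):
    """Project a (possibly adversarial) sample back onto the data
    manifold by searching the latent space of the generator.

    x: input tensor of shape (1, C, H, W); G, D: trained and frozen.
    lam weights the discriminator term (hypothetical default).
    """
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = G(z)
        # Reconstruction term keeps the projection close to the input;
        # subtracting the discriminator score pulls the result toward
        # regions that D judges realistic.
        loss = ((x_hat - x) ** 2).mean() - lam * D(x_hat).mean()
        loss.backward()
        opt.step()
    return G(z).detach()
```

Because the procedure only touches the GAN and the input, the cleaned output G(z) can be passed to any downstream classifier unchanged, which is what makes the defense classifier- and attack-agnostic.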

