Defense against Universal Adversarial Perturbations

11/16/2017
by   Naveed Akhtar, et al.
0

Recent advances in Deep Learning show the existence of image-agnostic quasi-imperceptible perturbations that when applied to `any' image can fool a state-of-the-art network classifier to change its prediction about the image label. These `Universal Adversarial Perturbations' pose a serious threat to the success of Deep Learning in practice. We present the first dedicated framework to effectively defend the networks against such perturbations. Our approach learns a Perturbation Rectifying Network (PRN) as `pre-input' layers to a targeted model, such that the targeted model needs no modification. The PRN is learned from real and synthetic image-agnostic perturbations, where an efficient method to compute the latter is also proposed. A perturbation detector is separately trained on the Discrete Cosine Transform of the input-output difference of the PRN. A query image is first passed through the PRN and verified by the detector. If a perturbation is detected, the output of the PRN is used for label prediction instead of the actual image. A rigorous evaluation shows that our framework can defend the network classifiers against unseen adversarial perturbations in the real-world scenarios with up to 97.5 success rate. The PRN also generalizes well in the sense that training for one targeted network defends another network with a comparable success rate.

READ FULL TEXT

page 1

page 5

page 7

research
12/08/2020

Locally optimal detection of stochastic targeted universal adversarial perturbations

Deep learning image classifiers are known to be vulnerable to small adve...
research
10/10/2019

Universal Adversarial Perturbation for Text Classification

Given a state-of-the-art deep neural network text classifier, we show th...
research
11/30/2018

Adversarial Defense by Stratified Convolutional Sparse Coding

We propose an adversarial defense method that achieves state-of-the-art ...
research
10/28/2020

Transferable Universal Adversarial Perturbations Using Generative Models

Deep neural networks tend to be vulnerable to adversarial perturbations,...
research
07/19/2020

Connecting the Dots: Detecting Adversarial Perturbations Using Context Inconsistency

There has been a recent surge in research on adversarial perturbations t...
research
08/03/2018

Ask, Acquire, and Attack: Data-free UAP Generation using Class Impressions

Deep learning models are susceptible to input specific noise, called adv...
research
02/20/2018

On Lyapunov exponents and adversarial perturbation

In this paper, we would like to disseminate a serendipitous discovery in...

Please sign up or login with your details

Forgot password? Click here to reset