Identifying Classes Susceptible to Adversarial Attacks

05/30/2019
by Rangeet Pan, et al.

Despite numerous attempts to defend deep learning based image classifiers, they remain susceptible to adversarial attacks. This paper proposes a technique to identify susceptible classes, i.e., those classes that are more easily subverted by an attack. To identify susceptible classes, we apply distance-based measures to a trained model. Based on the distances among the original classes, we create a mapping between original classes and adversarial classes that significantly reduces the randomness of a model in an adversarial setting. We analyze the high-dimensional geometry among the feature classes and identify the k most susceptible target classes for an adversarial attack. We conduct experiments on the MNIST, Fashion MNIST, and CIFAR-10 (ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques to determine which distance-based measure works best and how the randomness of a model changes with the amount of perturbation.
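The abstract does not spell out which distance-based measures the authors use, but a minimal sketch of the core idea, assuming Euclidean distance between class centroids in a trained model's penultimate-layer feature space, could look like the following. The susceptible_targets helper, its signature, and the centroid heuristic are illustrative assumptions, not the authors' code.

```python
import numpy as np

def susceptible_targets(features, labels, k=3):
    """For each original class, rank the k most susceptible target classes
    by distance between class centroids in feature space.

    features: (n_samples, d) array of penultimate-layer activations
    labels:   (n_samples,) array of integer class labels
    """
    classes = np.unique(labels)
    # Centroid of each class in the learned feature space.
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])

    ranking = {}
    for i, c in enumerate(classes):
        # Euclidean distance from class c's centroid to every other centroid.
        dists = np.linalg.norm(centroids - centroids[i], axis=1)
        dists[i] = np.inf  # exclude the class itself
        # The nearest classes are taken as the most likely adversarial targets.
        ranking[int(c)] = classes[np.argsort(dists)[:k]].tolist()
    return ranking

# Example with random features standing in for real model activations.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))
labs = rng.integers(0, 10, size=1000)
print(susceptible_targets(feats, labs, k=3))
```

Swapping a cosine or Mahalanobis distance in for the norm call would let one compare which measure best predicts the susceptible classes, in the spirit of the paper's evaluation.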


