Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes

11/22/2021
by Utku Ozbulak, et al.

Although ImageNet was initially proposed as a dataset for performance benchmarking in computer vision, it has also enabled a variety of other research efforts. Adversarial machine learning is one such effort, employing deceptive inputs to fool models into making wrong predictions. ImageNet remains one of the most frequently used datasets for evaluating attacks and defenses in this field. However, the nature of the classes into which adversarial examples are misclassified has yet to be investigated. In this paper, we perform a detailed analysis of these misclassification classes, leveraging the ImageNet class hierarchy and measuring where these classes rank among the predictions for the unperturbed source images of the adversarial examples. We find that 71% of the adversarial examples that achieve model-to-model adversarial transferability are misclassified into one of the top-5 classes predicted for the underlying source images. We also find that a large subset of untargeted misclassifications are, in fact, misclassifications into semantically similar classes. Based on these findings, we discuss the need to take the ImageNet class hierarchy into account when evaluating untargeted adversarial successes. Furthermore, we advocate for future research efforts to incorporate categorical information.
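The top-5 measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `top_k_classes` and `fraction_in_top_k`, the toy scores, and the four-class setup are all assumptions made for illustration. Given the class scores a model assigns to each unperturbed source image and the class each corresponding adversarial example is misclassified into, it computes the fraction of misclassifications that land in the source image's top-k predictions.

```python
from typing import List, Sequence


def top_k_classes(scores: Sequence[float], k: int = 5) -> List[int]:
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def fraction_in_top_k(
    clean_scores: Sequence[Sequence[float]],
    adversarial_labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of adversarial examples whose predicted class is among the
    top-k classes predicted for the corresponding unperturbed source image."""
    if len(clean_scores) != len(adversarial_labels):
        raise ValueError("need one adversarial label per source image")
    if not adversarial_labels:
        return 0.0
    hits = sum(
        label in top_k_classes(scores, k)
        for scores, label in zip(clean_scores, adversarial_labels)
    )
    return hits / len(adversarial_labels)


# Toy data (hypothetical): three source images, four classes each.
clean_scores = [
    [0.10, 0.60, 0.20, 0.05],  # source image predicted as class 1
    [0.70, 0.15, 0.10, 0.05],  # source image predicted as class 0
    [0.02, 0.03, 0.15, 0.80],  # source image predicted as class 3
]
# Class each adversarial example is misclassified into (hypothetical).
adversarial_labels = [2, 3, 0]

print(fraction_in_top_k(clean_scores, adversarial_labels, k=2))
```

With real data, `clean_scores` would hold the model's 1,000 ImageNet class scores per source image and `k` would be 5. A value close to 1 would indicate that most misclassifications fall into classes the model already ranked highly for the clean image.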


research
06/14/2021

Selection of Source Images Heavily Influences the Effectiveness of Adversarial Attacks

Although the adoption rate of deep neural networks (DNNs) has tremendous...
research
06/21/2021

Adversarial Examples Make Strong Poisons

The adversarial machine learning literature is largely partitioned into ...
research
03/27/2023

Improving the Transferability of Adversarial Examples via Direction Tuning

In the transfer-based adversarial attacks, adversarial examples are only...
research
04/11/2017

The Space of Transferable Adversarial Examples

Adversarial examples are maliciously perturbed inputs designed to mislea...
research
09/08/2020

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Deep Learning algorithms have achieved the state-of-the-art performance ...
research
08/24/2022

Bugs in the Data: How ImageNet Misrepresents Biodiversity

ImageNet-1k is a dataset often used for benchmarking machine learning (M...
research
03/11/2022

The Role of ImageNet Classes in Fréchet Inception Distance

Fréchet Inception Distance (FID) is a metric for quantifying the distanc...
