Selection of Source Images Heavily Influences the Effectiveness of Adversarial Attacks

by Utku Ozbulak, et al.

Although the adoption rate of deep neural networks (DNNs) has increased tremendously in recent years, a solution for their vulnerability to adversarial examples has not yet been found. As a result, substantial research efforts are dedicated to fixing this weakness, with many studies using a subset of source images to generate adversarial examples and treating every image in this subset as equal. We demonstrate that, in fact, not every source image is equally suited for this kind of assessment. To do so, we devise a large-scale model-to-model transferability scenario in which we meticulously analyze the properties of adversarial examples, generated from every suitable source image in ImageNet using two of the most frequently deployed attacks. In this transferability scenario, which involves seven distinct DNN models, including the recently proposed vision transformers, we reveal that it is possible to have a difference of up to 12.5% in model-to-model transferability success, 1.01 in average L_2 perturbation, and 0.03 (8/255) in average L_∞ perturbation when 1,000 source images are sampled randomly among all suitable candidates. We then take one of the first steps in evaluating the robustness of images used to create adversarial examples, proposing a number of simple but effective methods to identify unsuitable source images, thus making it possible to mitigate extreme cases in experimentation and support high-quality benchmarking.
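As a rough sketch of how the quantities reported above might be computed (the function names and NumPy-based setup are illustrative assumptions, not the paper's actual evaluation code):

```python
import numpy as np

def perturbation_norms(x, x_adv):
    """Return the L2 and L_inf norms of the adversarial perturbation
    for a single image, with pixel values assumed to lie in [0, 1]."""
    delta = (np.asarray(x_adv, dtype=np.float64)
             - np.asarray(x, dtype=np.float64)).ravel()
    l2 = float(np.linalg.norm(delta, ord=2))
    linf = float(np.abs(delta).max())
    return l2, linf

def transfer_success_rate(target_preds, true_labels):
    """Fraction of adversarial examples (crafted on a source model)
    that a *target* model misclassifies, i.e. model-to-model
    transferability success."""
    target_preds = np.asarray(target_preds)
    true_labels = np.asarray(true_labels)
    return float(np.mean(target_preds != true_labels))
```

Under this setup, an L_∞ budget of 8/255 ≈ 0.031 corresponds to the 0.03 figure mentioned in the abstract, and the variance across random 1,000-image samples would be observed by repeatedly drawing such subsets and comparing the resulting averages.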

