Perturbation Analysis of Gradient-based Adversarial Attacks

06/02/2020
by Utku Ozbulak et al.

After the discovery of adversarial examples and their adverse effects on deep learning models, many studies have focused on developing more diverse methods to generate these carefully crafted samples. Although empirical results on the effectiveness of adversarial example generation methods against defense mechanisms are discussed in detail in the literature, an in-depth study of the theoretical properties and the perturbation effectiveness of these adversarial attacks has largely been lacking. In this paper, we investigate the objective functions of three popular methods for adversarial example generation: the L-BFGS attack, the Iterative Fast Gradient Sign attack, and the Carlini & Wagner (CW) attack. Specifically, we perform a comparative and formal analysis of the loss functions underlying the aforementioned attacks and present large-scale experimental results on the ImageNet dataset. This analysis exposes (1) the faster optimization speed as well as the constrained optimization space of the cross-entropy loss, (2) the detrimental effects of taking the sign of the cross-entropy loss gradient on optimization precision as well as optimization space, and (3) the slow optimization speed of the logit loss in the context of adversariality. Our experiments reveal that the Iterative Fast Gradient Sign attack, which is thought to be fast at generating adversarial examples, in fact requires the largest number of iterations to create adversarial examples under equal-perturbation settings. Moreover, our experiments show that the loss function underlying CW, which is criticized for being substantially slower than other adversarial attacks, is not substantially slower than the other loss functions. Finally, we analyze how well neural networks can identify adversarial perturbations generated by the attacks under consideration, thereby revisiting the idea of adversarial retraining on ImageNet.
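For context, the attacks compared above optimize different objectives: the L-BFGS and Iterative Fast Gradient Sign attacks ascend the cross-entropy loss (the latter using only the sign of its gradient), while the CW attack works on a margin-style logit loss. The snippet below is a minimal PyTorch sketch of these per-step update rules, not code taken from the paper; `model`, `x`, `y`, and the step-size parameters `alpha` and `kappa` are illustrative assumptions.

```python
# Illustrative sketch (not from the paper) of the update rules implied by the
# loss functions discussed in the abstract. Assumes `model` maps image batches
# in [0, 1] to logits, `x` is a float tensor of shape [N, C, H, W], and `y` is
# a long tensor of true labels of shape [N].
import torch
import torch.nn.functional as F

def ifgsm_step(model, x, y, alpha=1 / 255):
    """One Iterative Fast Gradient Sign step: ascend the cross-entropy loss
    using only the sign of its gradient (hypothetical step size alpha)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + alpha * x.grad.sign()).clamp(0, 1).detach()

def cw_logit_loss(logits, y, kappa=0.0):
    """CW-style logit loss: gap between the true-class logit and the largest
    other logit, clamped at -kappa (zero once the sample is adversarial)."""
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    others = logits.clone()
    others.scatter_(1, y.unsqueeze(1), float("-inf"))  # mask the true class
    max_other = others.max(dim=1).values
    return torch.clamp(true_logit - max_other, min=-kappa)

def logit_loss_step(model, x, y, alpha=0.01):
    """One plain gradient-descent step on the CW logit loss, i.e., push the
    true-class logit below the runner-up class."""
    x = x.clone().detach().requires_grad_(True)
    loss = cw_logit_loss(model(x), y).sum()
    loss.backward()
    return (x - alpha * x.grad).clamp(0, 1).detach()
```

Note the key structural difference the paper's analysis revolves around: `ifgsm_step` discards gradient magnitude via `sign()`, whereas `logit_loss_step` follows the raw gradient of a margin on the logits.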


Related research

07/07/2020
Regional Image Perturbation Reduces L_p Norms of Adversarial Examples While Maintaining Model-to-model Transferability
Regional adversarial attacks often rely on complicated methods for gener...

05/19/2022
On Trace of PGD-Like Adversarial Attacks
Adversarial attacks pose safety and security concerns for deep learning ...

12/19/2019
Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)
We propose the scheme that mitigates an adversarial perturbation ϵ on th...

08/15/2022
A Multi-objective Memetic Algorithm for Auto Adversarial Attack Optimization Design
The phenomenon of adversarial examples has been revealed in variant scen...

03/09/2023
Decision-BADGE: Decision-based Adversarial Batch Attack with Directional Gradient Estimation
The vulnerability of deep neural networks to adversarial examples has le...

07/01/2020
Determining Sequence of Image Processing Technique (IPT) to Detect Adversarial Attacks
Developing secure machine learning models from adversarial examples is c...

12/02/2021
A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space
The generation of feasible adversarial examples is necessary for properl...
