Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks

12/28/2021
by Weiran Lin, et al.

Minimal adversarial perturbations added to inputs have been shown to be effective at fooling deep neural networks. In this paper, we introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal: to trick the model into assigning a higher probability to the target class than to any other class, while staying within a specified distance from the original input. First, we propose a new loss function that explicitly captures the goal of targeted attacks; in particular, it uses the logits of all classes instead of just a subset, as is common. We show that Auto-PGD finds more adversarial examples with this loss function than with other commonly used loss functions. Second, we propose a new attack method that uses a further-developed version of our loss function capturing both the misclassification objective and the L_∞ distance limit ϵ. This new attack method is relatively 1.5–4.2% more successful on the CIFAR10 dataset and relatively 8.2–14.9% more successful on ImageNet than the next best state-of-the-art attack. We confirm using statistical tests that our attack outperforms state-of-the-art attacks across different datasets, values of ϵ, and defenses.
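As a concrete illustration of the two ideas above, here is a minimal PyTorch sketch: a targeted loss that uses the logits of all classes (a hinge margin of every non-target logit against the target logit), and a gradient-descent loop that enforces the L_∞ limit ϵ by projection after each step. This is an illustrative rendering of the abstract's description, not the paper's exact loss or attack algorithm; the hyperparameters (eps, steps, lr) and the use of Adam are assumptions.

    import torch

    def all_logits_targeted_loss(logits, target):
        # Per-example target-class logit, shape (B, 1).
        z_t = logits.gather(1, target.unsqueeze(1))
        # Hinge margin of every class logit against the target logit, shape (B, C).
        # The target class contributes max(0, z_t - z_t) = 0, so no masking is needed.
        margins = torch.clamp(logits - z_t, min=0.0)
        # Loss is zero exactly when the target logit dominates all other logits.
        return margins.sum(dim=1)

    def constrained_targeted_attack(model, x, target, eps=8 / 255, steps=100, lr=1e-2):
        # Optimize the perturbation directly; enforce the L_inf limit eps by
        # clamping (projection onto the L_inf ball) after every update.
        delta = torch.zeros_like(x, requires_grad=True)
        optimizer = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            adv = torch.clamp(x + delta, 0.0, 1.0)  # keep pixels in a valid range
            loss = all_logits_targeted_loss(model(adv), target).sum()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)
        return torch.clamp(x + delta, 0.0, 1.0).detach()

    # Hypothetical usage with a CIFAR10-sized classifier returning raw logits:
    # model = MyClassifier().eval()
    # x = torch.rand(8, 3, 32, 32)             # inputs scaled to [0, 1]
    # target = torch.randint(0, 10, (8,))      # desired (wrong) labels
    # x_adv = constrained_targeted_attack(model, x, target, eps=8 / 255)

The attack succeeds on an example when the model assigns the target class the highest logit on the returned input, matching the attacker's goal stated above; at that point the hinge loss is exactly zero.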



Related research

03/09/2023 · Decision-BADGE: Decision-based Adversarial Batch Attack with Directional Gradient Estimation
The vulnerability of deep neural networks to adversarial examples has le...

06/02/2023 · Adversarial Attack Based on Prediction-Correction
Deep neural networks (DNNs) are vulnerable to adversarial examples obtai...

05/21/2020 · Inaudible Adversarial Perturbations for Targeted Attack in Speaker Recognition
Speaker recognition is a popular topic in biometric authentication and m...

02/15/2022 · Unreasonable Effectiveness of Last Hidden Layer Activations
In standard Deep Neural Network (DNN) based classifiers, the general con...

03/10/2020 · SAD: Saliency-based Defenses Against Adversarial Examples
With the rise in popularity of machine and deep learning models, there i...

11/09/2021 · A Statistical Difference Reduction Method for Escaping Backdoor Detection
Recent studies show that Deep Neural Networks (DNNs) are vulnerable to b...

02/21/2020 · Adversarial Detection and Correction by Matching Prediction Distributions
We present a novel adversarial detection and correction method for machi...
