Increasing Confidence in Adversarial Robustness Evaluations

06/28/2022
by Roland S. Zimmermann, et al.

Hundreds of defenses have been proposed to make deep neural networks robust against minimal (adversarial) input perturbations. However, only a handful of these defenses have held up to their claims, because correctly evaluating robustness is extremely challenging: weak attacks often fail to find adversarial examples even when they exist, making a vulnerable network look robust. In this paper, we propose a test to identify weak attacks and, thus, weak defense evaluations. Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample. Consequently, any correct attack must succeed in breaking this modified network. For eleven out of thirteen previously published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it. We hope that attack unit tests, such as ours, will become a major component of future robustness evaluations and increase confidence in an empirical field that is currently riddled with skepticism.
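The core idea admits a short sketch. The construction below is a minimal, hypothetical illustration in PyTorch, not the paper's exact test: assuming a classifier over inputs in [0, 1] under an L-infinity threat model, we wrap the model so that, for each clean sample, a specific point just inside the epsilon ball is adversarial by construction, then check whether a given attack finds any adversarial example. The names PlantedVulnerability, attack_unit_test, and attack_fn are our own, introduced only for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PlantedVulnerability(nn.Module):
    """Wraps a classifier so that, for every clean sample in x0, the point
    x0 + eps * direction is adversarial by construction, while the clean
    prediction is preserved (tanh is monotonic, so the argmax is kept)."""

    def __init__(self, model, x0, targets, num_classes, eps):
        super().__init__()
        self.model = model
        self.num_classes = num_classes
        self.register_buffer("x0", x0)
        self.register_buffer("targets", targets)
        # Direction pointing toward the interior of [0, 1], so the planted
        # adversarial example remains a valid image for eps <= 0.5.
        self.register_buffer(
            "direction",
            torch.where(x0 <= 0.5, torch.ones_like(x0), -torch.ones_like(x0)),
        )
        # Chosen so the planted logit reaches +4 at x0 + eps * direction,
        # strictly dominating the tanh-squashed logits, which lie in [-1, 1].
        self.boost = 4.0 / (eps * x0[0].numel())

    def forward(self, x):
        logits = torch.tanh(self.model(x))
        # Correlation of the perturbation with the planted direction:
        # zero at the clean sample, eps * dim at the planted point.
        score = ((x - self.x0) * self.direction).flatten(1).sum(dim=1)
        bonus = self.boost * score.unsqueeze(1) * F.one_hot(
            self.targets, self.num_classes
        ).to(logits)
        return logits + bonus


def attack_unit_test(model, attack_fn, x, y, eps, num_classes):
    """Returns True only if attack_fn breaks a model that is guaranteed
    to have an adversarial example inside the eps-ball of every sample.
    Assumes attack_fn perturbs the batch x sample-by-sample in order."""
    targets = (y + 1) % num_classes            # any class != true label
    vulnerable = PlantedVulnerability(model, x, targets, num_classes, eps)
    x_adv = attack_fn(vulnerable, x, y, eps)   # run the attack under test
    assert (x_adv - x).abs().max() <= eps + 1e-6, "attack left the eps-ball"
    # A sound attack must fool the modified network on every sample.
    return bool((vulnerable(x_adv).argmax(dim=1) != y).all())

An attack that fails to break even this provably vulnerable model is too weak, and a defense evaluation that relies on it should not be trusted.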


Related research

On Adaptive Attacks to Adversarial Example Defenses (02/19/2020)
Adaptive attacks have (rightfully) become the de facto standard for eval...

On Evaluating Adversarial Robustness (02/18/2019)
Correctly evaluating defenses against adversarial examples has proven to...

Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness (10/13/2021)
The vulnerability of deep neural networks to adversarial examples has mo...

Ground-Truth Adversarial Examples (09/29/2017)
The ability to deploy neural networks in real-world, safety-critical sys...

Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer (08/13/2022)
Backdoor attacks have been shown to be a serious security threat against...

Understanding the Error in Evaluating Adversarial Robustness (01/07/2021)
Deep neural networks are easily misled by adversarial examples. Although...

A Comprehensive Evaluation Framework for Deep Model Robustness (01/24/2021)
Deep neural networks (DNNs) have achieved remarkable performance across ...
