Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples

06/18/2021
by Maura Pintor, et al.

Evaluating the robustness of machine-learning models to adversarial examples is a challenging problem. Many defenses have been shown to provide a false sense of security by causing gradient-based attacks to fail, and they have been broken under more rigorous evaluations. Although guidelines and best practices have been suggested to improve current adversarial robustness evaluations, the lack of automatic testing and debugging tools makes it difficult to apply these recommendations in a systematic manner. In this work, we overcome these limitations by (i) defining a set of quantitative indicators that reveal common failures in the optimization of gradient-based attacks, and (ii) proposing specific mitigation strategies within a systematic evaluation protocol. Our extensive experimental analysis shows that the proposed indicators of failure can be used to visualize, debug, and improve adversarial robustness evaluations, providing a first concrete step towards automating and systematizing them. Our open-source code is available at: https://github.com/pralab/IndicatorsOfAttackFailure.
