On Evaluating Adversarial Robustness

by   Nicholas Carlini, et al.

Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls.


page 1

page 2

page 3

page 4


On Adaptive Attacks to Adversarial Example Defenses

Adaptive attacks have (rightfully) become the de facto standard for eval...

Adversarial Risk and the Dangers of Evaluating Against Weak Attacks

This paper investigates recently proposed approaches for defending again...

Increasing Confidence in Adversarial Robustness Evaluations

Hundreds of defenses have been proposed to make deep neural networks rob...

Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples

Evaluating robustness of machine-learning models to adversarial examples...

Evaluation of Neural Networks Defenses and Attacks using NDCG and Reciprocal Rank Metrics

The problem of attacks on neural networks through input modification (i....

Measuring the False Sense of Security

Recently, several papers have demonstrated how widespread gradient maski...

On Need for Topology-Aware Generative Models for Manifold-Based Defenses

ML algorithms or models, especially deep neural networks (DNNs), have sh...

Please sign up or login with your details

Forgot password? Click here to reset