Adversarial Risk and the Dangers of Evaluating Against Weak Attacks

02/15/2018
by Jonathan Uesato, et al.

This paper investigates recently proposed approaches for defending against adversarial examples and for evaluating adversarial robustness. The existence of adversarial examples in trained neural networks shows that expected risk alone does not capture a model's performance on worst-case inputs. We motivate the use of adversarial risk as an objective, although it cannot be computed exactly in general. We then frame commonly used attacks and evaluation metrics as defining a tractable surrogate objective for the true adversarial risk. This suggests that models can become obscured to adversaries by optimizing this surrogate rather than the true adversarial risk. We demonstrate that this is a significant problem in practice by repurposing gradient-free optimization techniques into adversarial attacks, which we use to reduce the accuracy of several recently proposed defenses to near zero. Our hope is that these formulations and results will help researchers to develop more powerful defenses.
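
The attacks referenced in the abstract need only query the model's loss, not its gradients, so a defense that merely masks gradients does not help against them. As a rough illustration of the idea (not the paper's exact procedure), the sketch below repurposes SPSA-style finite-difference gradient estimation into an L-infinity attack in NumPy; loss_fn, the step sizes, and the projection details are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

def spsa_attack(loss_fn, x, epsilon, num_steps=100, batch_size=128,
                delta=0.01, lr=0.01):
    """Gradient-free L-infinity attack using SPSA-style gradient estimates.

    loss_fn is a hypothetical callable mapping a batch of inputs with shape
    (batch_size,) + x.shape to per-example losses; only loss evaluations are
    used, never gradients.
    """
    x = x.astype(np.float64)
    x_adv = x.copy()
    for _ in range(num_steps):
        # Sample random +/-1 (Rademacher) perturbation directions.
        v = np.random.choice([-1.0, 1.0], size=(batch_size,) + x.shape)
        # Two-sided finite-difference estimate of the loss gradient.
        diff = loss_fn(x_adv + delta * v) - loss_fn(x_adv - delta * v)
        grad_est = (diff.reshape(batch_size, *([1] * x.ndim)) * v
                    / (2.0 * delta)).mean(axis=0)
        # Take a signed ascent step on the loss, then project back into the
        # epsilon-ball around the clean input.
        x_adv = np.clip(x_adv + lr * np.sign(grad_est),
                        x - epsilon, x + epsilon)
    return x_adv

# Hypothetical usage: attack a single image given a black-box loss.
# x_adv = spsa_attack(lambda xs: model_loss(xs, label), x, epsilon=0.03)
```

Because the estimator averages loss differences along random directions, it only needs forward passes, which is what lets this style of attack evaluate defenses whose gradients are unavailable or uninformative.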

Related Research

02/18/2019 · On Evaluating Adversarial Robustness
Correctly evaluating defenses against adversarial examples has proven to...

09/22/2018 · Unrestricted Adversarial Examples
We introduce a two-player contest for evaluating the safety and robustne...

01/07/2021 · Understanding the Error in Evaluating Adversarial Robustness
Deep neural networks are easily misled by adversarial examples. Although...

01/28/2023 · Selecting Models based on the Risk of Damage Caused by Adversarial Attacks
Regulation, legal liabilities, and societal concerns challenge the adopt...

01/10/2022 · Evaluation of Neural Networks Defenses and Attacks using NDCG and Reciprocal Rank Metrics
The problem of attacks on neural networks through input modification (i....

12/10/2019 · Statistically Robust Neural Network Classification
Recently there has been much interest in quantifying the robustness of n...

11/08/2018 · New CleverHans Feature: Better Adversarial Robustness Evaluations with Attack Bundling
This technical report describes a new feature of the CleverHans library ...
