Wasserstein distributional robustness of neural networks

06/16/2023
by Xingjian Bai et al.

Deep neural networks are known to be vulnerable to adversarial attacks (AA). For an image recognition task, this means that a small perturbation of the original image can result in it being misclassified. The design of such attacks, as well as methods of adversarial training against them, is the subject of intense research. We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions leveraging recent insights from DRO sensitivity analysis. We consider a set of distributional threat models. Unlike traditional pointwise attacks, which assume a uniform bound on the perturbation of each input data point, distributional threat models allow attackers to perturb inputs in a non-uniform way. We link these more general attacks with questions of out-of-sample performance and Knightian uncertainty. To evaluate the distributional robustness of neural networks, we propose a first-order AA algorithm and its multi-step version. Our attack algorithms include the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) as special cases. Furthermore, we provide a new asymptotic estimate of the adversarial accuracy against distributional threat models. The bound is fast to compute and first-order accurate, offering new insights even for the pointwise AA. It also naturally yields out-of-sample performance guarantees. We conduct numerical experiments on the CIFAR-10 dataset using DNNs on RobustBench to illustrate our theoretical results. Our code is available at https://github.com/JanObloj/W-DRO-Adversarial-Methods.
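
For context, the pointwise attacks that the abstract generalizes are typically instantiated as FGSM (a single gradient-sign step) and PGD (its iterated, projected version). The sketch below is a minimal PyTorch implementation of these standard baselines, assuming an L-infinity threat model on images in [0,1]; it is not the distributional (W-DRO) attack of the paper, which is available in the linked repository, and the function name and hyperparameter values are illustrative.

```python
import torch

def pgd_attack(model, x, y, loss_fn, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD; with steps=1 and alpha=eps it reduces to FGSM.

    Illustrative baseline only -- not the distributional (W-DRO) attack of the paper.
    """
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()            # ascend the loss
            x_adv = torch.clamp(x_adv, x - eps, x + eps)   # project onto the eps-ball around x
            x_adv = torch.clamp(x_adv, 0.0, 1.0)           # stay in the image domain [0,1]
    return x_adv.detach()
```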


Related research

- A New Kind of Adversarial Example (08/04/2022): Almost all adversarial attacks are formulated to add an imperceptible pe...
- A Unified Wasserstein Distributional Robustness Framework for Adversarial Training (02/27/2022): It is well-known that deep neural networks (DNNs) are susceptible to adv...
- Beckman Defense (01/04/2023): Optimal transport (OT) based distributional robust optimisation (DRO) ha...
- Mind the box: l_1-APGD for sparse adversarial attacks on image classifiers (03/01/2021): We show that when taking into account also the image domain [0,1]^d, est...
- Wasserstein Adversarial Examples via Projected Sinkhorn Iterations (02/21/2019): A rapidly growing area of work has studied the existence of adversarial ...
- Improved Image Wasserstein Attacks and Defenses (04/26/2020): Robustness against image perturbations bounded by a ℓ_p ball have been w...
- Stronger and Faster Wasserstein Adversarial Attacks (08/06/2020): Deep models, while being extremely flexible and accurate, are surprising...
