Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks

10/23/2019
by Alexander Levine, et al.

In the last couple of years, several adversarial attack methods based on different threat models have been proposed for the image classification problem. Most existing defenses consider additive threat models, in which sample perturbations have bounded L_p norms. These defenses, however, can be vulnerable to adversarial attacks under non-additive threat models. An example of an attack method based on a non-additive threat model is the Wasserstein adversarial attack proposed by Wong et al. (2019), where the distance between an image and its adversarial example is determined by the Wasserstein metric ("earth-mover distance") between their normalized pixel intensities. Until now, there has been no certifiable defense against this type of attack. In this work, we propose the first defense with certified robustness against Wasserstein adversarial attacks using randomized smoothing. We develop this certificate by considering the space of possible flows between images, and representing this space such that the Wasserstein distance between images is upper-bounded by the L_1 distance in this flow space. We can then apply existing randomized smoothing certificates for the L_1 metric. On the MNIST and CIFAR-10 datasets, we find that our proposed defense is also practically effective, demonstrating significantly improved accuracy under Wasserstein adversarial attack compared to unprotected models.
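To make the flow-space construction concrete, here is a minimal sketch of the idea, assuming a grayscale image stored as a 2D NumPy array and an arbitrary base classifier. The helper names `apply_local_flow` and `smoothed_predict`, the noise scale, and the sample count are illustrative assumptions rather than the paper's implementation, and the certificate computation itself is omitted.

```python
# Minimal sketch of Wasserstein smoothing via Laplace noise in flow space.
# Assumptions: grayscale (H, W) images of normalized intensities, and a
# base classifier `classify` mapping an image to an integer label.
import numpy as np

def apply_local_flow(x, f_h, f_v):
    """Transport pixel mass along edges between adjacent pixels.

    x   : (H, W) image of normalized pixel intensities
    f_h : (H, W-1) flow on horizontal edges (positive = rightward)
    f_v : (H-1, W) flow on vertical edges (positive = downward)

    Total mass is conserved, and with unit ground distance between
    adjacent pixels the Wasserstein distance between x and the result
    is at most ||f_h||_1 + ||f_v||_1. (Values are not clipped here.)
    """
    y = x.copy()
    y[:, :-1] -= f_h   # mass leaving each pixel to the right
    y[:, 1:] += f_h    # mass arriving from the left
    y[:-1, :] -= f_v   # mass leaving each pixel downward
    y[1:, :] += f_v    # mass arriving from above
    return y

def smoothed_predict(classify, x, scale=0.01, n_samples=1000, rng=None):
    """Majority vote of `classify` under Laplace noise on the flows.

    Sampling i.i.d. Laplace noise on the flow coordinates makes this an
    L_1 randomized-smoothing classifier in flow space; since Wasserstein
    distance is upper-bounded by L_1 distance there, an L_1 certificate
    for the vote translates into a Wasserstein certificate for the image.
    """
    rng = rng or np.random.default_rng(0)
    H, W = x.shape
    votes = {}
    for _ in range(n_samples):
        f_h = rng.laplace(scale=scale, size=(H, W - 1))
        f_v = rng.laplace(scale=scale, size=(H - 1, W))
        label = classify(apply_local_flow(x, f_h, f_v))
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

The key design point this illustrates is that an adversarial flow with L_1 budget ρ moves the image by Wasserstein distance at most ρ, so Laplace-based L_1 smoothing analyses can be applied unchanged in flow space.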

Related research

06/22/2020 · Perceptual Adversarial Robustness: Defense Against Unseen Threat Models
We present adversarial attacks and defenses for the perceptual adversari...

08/06/2020 · Stronger and Faster Wasserstein Adversarial Attacks
Deep models, while being extremely flexible and accurate, are surprising...

12/10/2021 · Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks
Deep neural networks have become the driving force of modern image recog...

02/21/2019 · Wasserstein Adversarial Examples via Projected Sinkhorn Iterations
A rapidly growing area of work has studied the existence of adversarial ...

10/13/2021 · A Framework for Verification of Wasserstein Adversarial Robustness
Machine learning image classifiers are susceptible to adversarial and co...

03/16/2022 · Provable Adversarial Robustness for Fractional Lp Threat Models
In recent years, researchers have extensively studied adversarial robust...

04/26/2020 · Improved Image Wasserstein Attacks and Defenses
Robustness against image perturbations bounded by a ℓ_p ball have been w...
