Understanding Adversarial Robustness Against On-manifold Adversarial Examples

10/02/2022
by Jiancong Xiao, et al.

Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples: a well-trained model can be easily attacked by adding small perturbations to the original data. One hypothesis for the existence of adversarial examples is the off-manifold assumption: adversarial examples lie off the data manifold. However, recent research has shown that on-manifold adversarial examples also exist. In this paper, we revisit the off-manifold assumption and study the question: to what extent is the poor performance of neural networks against adversarial attacks due to on-manifold adversarial examples? Since the true data manifold is unknown in practice, we consider two types of approximated on-manifold adversarial examples, on both real and synthetic datasets. On real datasets, we show that on-manifold adversarial examples achieve higher attack success rates than off-manifold adversarial examples against both standard-trained and adversarially-trained models. On synthetic datasets, we theoretically prove that on-manifold adversarial examples are powerful, yet adversarial training focuses on off-manifold directions and ignores on-manifold adversarial examples. Furthermore, we provide an analysis showing that the properties derived theoretically can also be observed in practice. Our analysis suggests that on-manifold adversarial examples are important, and that more attention should be paid to them when training robust models.
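The abstract contrasts two perturbation models. As a minimal sketch (not the paper's exact attack implementations), the distinction can be illustrated in PyTorch: an off-manifold attack perturbs directly in input space, while an on-manifold attack perturbs the latent code of a generative model that approximates the data manifold. Here `model`, `generator`, and all hyperparameter values are assumptions for illustration only.

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Off-manifold attack: standard L-infinity PGD that perturbs
    the input x directly in pixel space within a ball of radius eps."""
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back into the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1)

def on_manifold_attack(model, generator, z, y, eps=0.1, alpha=0.02, steps=10):
    """Approximated on-manifold attack: perturb the latent code z so
    that the adversarial example generator(z + delta) stays on the
    manifold defined by the generative model."""
    delta = torch.zeros_like(z).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(generator(z + delta)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return generator(z + delta)

In this sketch, `generator` stands in for any decoder (e.g., of a VAE or GAN) trained to approximate the data distribution; since the true manifold is unknown, its range is the "approximated" manifold the abstract refers to, and the two functions differ only in which space the perturbation budget is enforced.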


