
Understanding Adversarial Robustness Against On-manifold Adversarial Examples

10/02/2022
by Jiancong Xiao, et al.
The Chinese University of Hong Kong, Shenzhen

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples: a well-trained model can be easily attacked by adding small perturbations to the original data. One hypothesis for the existence of adversarial examples is the off-manifold assumption: adversarial examples lie off the data manifold. However, recent research has shown that on-manifold adversarial examples also exist. In this paper, we revisit the off-manifold assumption and study the question: to what extent is the poor performance of neural networks against adversarial attacks due to on-manifold adversarial examples? Since the true data manifold is unknown in practice, we consider two kinds of approximated on-manifold adversarial examples on both real and synthetic datasets. On real datasets, we show that on-manifold adversarial examples achieve higher attack success rates than off-manifold adversarial examples on both standard-trained and adversarially-trained models. On synthetic datasets, we theoretically prove that on-manifold adversarial examples are powerful, yet adversarial training focuses on off-manifold directions and ignores on-manifold adversarial examples. Furthermore, we provide analysis showing that the properties derived theoretically can also be observed in practice. Our analysis suggests that on-manifold adversarial examples are important, and that more attention should be paid to them when training robust models.
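To make the distinction concrete, the sketch below (PyTorch) contrasts a standard off-manifold attack, which perturbs directly in pixel space, with one common way to approximate an on-manifold attack, which perturbs the latent code of a pretrained generative model so the perturbed input stays near the learned data manifold. This is an illustrative sketch only, not the paper's exact procedure; `classifier` and `decoder` are hypothetical pretrained modules, and the step sizes and budgets are placeholder values.

```python
import torch
import torch.nn.functional as F

def pgd_off_manifold(classifier, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-inf PGD: perturb directly in pixel space (off the data manifold)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(classifier(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def pgd_on_manifold(classifier, decoder, z, y, eps=0.1, alpha=0.02, steps=10):
    """Approximated on-manifold attack: perturb the latent code z of a generative
    model (e.g., a VAE or GAN decoder), so decoder(z_adv) remains close to the
    learned data manifold."""
    z_adv = z.clone().detach()
    for _ in range(steps):
        z_adv.requires_grad_(True)
        loss = F.cross_entropy(classifier(decoder(z_adv)), y)
        grad = torch.autograd.grad(loss, z_adv)[0]
        z_adv = z_adv.detach() + alpha * grad.sign()
        z_adv = z + (z_adv - z).clamp(-eps, eps)  # constrain the latent perturbation
    return decoder(z_adv).detach()
```

Comparing the attack success rates of these two procedures against standard-trained and adversarially-trained classifiers is the kind of experiment the abstract refers to; the latent-space variant depends on how well the generative model approximates the true data manifold.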


Related research:
- Disentangling Adversarial Robustness and Generalization (12/03/2018)
- The Dimpled Manifold Model of Adversarial Examples in Machine Learning (06/18/2021)
- The Limitations of Adversarial Training and the Blind-Spot Attack (01/15/2019)
- Idealised Bayesian Neural Networks Cannot Have Adversarial Examples: Theoretical and Empirical Study (06/02/2018)
- When adversarial examples are excusable (04/25/2022)
- Retrieval-Augmented Convolutional Neural Networks for Improved Robustness against Adversarial Examples (02/26/2018)
- Adversarial Examples Are Not Bugs, They Are Features (05/06/2019)