The Dimpled Manifold Model of Adversarial Examples in Machine Learning

06/18/2021
by Adi Shamir, et al.

The extreme fragility of deep neural networks when presented with tiny perturbations of their inputs was independently discovered by several research groups in 2013, but in spite of enormous effort, these adversarial examples have remained a baffling phenomenon with no clear explanation. In this paper we introduce a new conceptual framework (which we call the Dimpled Manifold Model) that provides a simple explanation for why adversarial examples exist, why their perturbations have such tiny norms, why these perturbations look like random noise, and why a network that was adversarially trained with incorrectly labeled images can still correctly classify test images. In the last part of the paper we describe the results of numerous experiments which strongly support this new model, and in particular our assertion that adversarial perturbations are roughly perpendicular to the low-dimensional manifold containing all the training examples.
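
The model's central, testable claim is the last one: adversarial perturbations point mostly off the data manifold. As a minimal sketch of how such a claim can be quantified (this is not the authors' experimental code; the linear toy manifold, the logistic classifier, and all variable names are illustrative assumptions), the snippet below places synthetic data on a known k-dimensional subspace of R^n, trains a classifier, crafts an FGSM-style perturbation, and measures the fraction of its norm that lies in the manifold's tangent space:

```python
# Minimal sketch (illustrative, not the authors' code): measure how much
# of an adversarial perturbation lies on the data manifold. For simplicity
# the "manifold" here is a linear k-dimensional subspace of R^n.
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 200, 10, 2000                  # ambient dim, manifold dim, samples

# Orthonormal basis for a random k-dimensional subspace of R^n.
basis, _ = np.linalg.qr(rng.standard_normal((n, k)))
coords = rng.standard_normal((m, k))
X = coords @ basis.T                     # every sample lies on the manifold
t = (coords[:, 0] > 0).astype(float)     # binary labels from 1st coordinate

# Train logistic regression by plain gradient descent.
w = np.zeros(n)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
    w -= 0.1 * X.T @ (p - t) / m

# FGSM-style perturbation at one test point: sign of the loss gradient.
x = X[0]
p0 = 1.0 / (1.0 + np.exp(-np.clip(x @ w, -30, 30)))
grad = w * (p0 - t[0])                   # d(cross-entropy)/dx
delta = np.sign(grad)
delta /= np.linalg.norm(delta)

# Fraction of the (unit-norm) perturbation inside the tangent space.
on_manifold = np.linalg.norm(basis.T @ delta)
print(f"on-manifold fraction: {on_manifold:.3f}")
print(f"random-direction baseline ~ sqrt(k/n) = {np.sqrt(k / n):.3f}")
```

For a perturbation that is roughly perpendicular to the manifold, the printed on-manifold fraction stays near (or below) the random-direction baseline rather than near 1. On real image data the tangent space is not known and must be estimated, e.g., with local PCA over nearby training images or a learned generative model, but the projection-and-measure step stays the same.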

Related Research

10/02/2022
Understanding Adversarial Robustness Against On-manifold Adversarial Examples
Deep neural networks (DNNs) are shown to be vulnerable to adversarial ex...

01/02/2018
High Dimensional Spaces, Deep Learning and Adversarial Examples
In this paper, we analyze deep learning from a mathematical point of vie...

06/09/2022
Meet You Halfway: Explaining Deep Learning Mysteries
Deep neural networks perform exceptionally well on various learning task...

01/30/2019
A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance
The existence of adversarial examples in which an imperceptible change i...

07/13/2020
Understanding Adversarial Examples from the Mutual Influence of Images and Perturbations
A wide variety of works have explored the reason for the existence of ad...

06/19/2020
Using Learning Dynamics to Explore the Role of Implicit Regularization in Adversarial Examples
Recent work (Ilyas et al., 2019) suggests that adversarial examples are f...

01/09/2018
Adversarial Spheres
State of the art computer vision models have been shown to be vulnerable...
