Using Learning Dynamics to Explore the Role of Implicit Regularization in Adversarial Examples

06/19/2020
by Josue Ortega Caro, et al.

Recent work (Ilyas et al., 2019) suggests that adversarial examples are features, not bugs. If adversarial perturbations are indeed useful but non-robust features, then what is their origin? To answer this question, we systematically examine the learning dynamics of adversarial perturbations in both the pixel and frequency domains. We find that: (1) adversarial examples are not present at initialization but instead emerge very early in training, typically within the first few epochs, as verified by a novel breakpoint-based analysis; (2) the low-amplitude, high-frequency nature of common adversarial perturbations in natural images depends critically on an implicit bias towards sparsity in the frequency domain; and (3) the origin of this bias is the locality and translation invariance of convolutional filters, together with (4) the existence of useful frequency-domain features in natural images. We give a simple theoretical explanation for these observations, offering a clear and minimalist target for theorists in future work. Looking forward, our findings suggest that analyzing the learning dynamics of perturbations can provide useful insights for understanding the origin of adversarial sensitivity and for developing robust solutions.
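As a concrete illustration of the kind of analysis the abstract describes, here is a minimal sketch of crafting a perturbation for a model checkpoint and inspecting its 2D Fourier spectrum. It assumes a PyTorch image classifier with inputs in [0, 1]; the names pgd_perturbation and spectral_profile, the use of L-infinity PGD, and the attack parameters are all illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn.functional as F

def pgd_perturbation(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD. Returns the perturbation delta rather than x + delta.
    Clamping x + delta to the valid pixel range is omitted for brevity."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend the loss
            delta.clamp_(-eps, eps)              # stay inside the L-inf ball
        delta.grad.zero_()
    return delta.detach()

def spectral_profile(delta):
    """Radially averaged energy of the perturbation's 2D Fourier spectrum,
    normalized to sum to 1; mass at large radii means high-frequency content."""
    spec = torch.fft.fftshift(torch.fft.fft2(delta), dim=(-2, -1)).abs() ** 2
    spec = spec.mean(dim=(0, 1))                 # average over batch and channels
    h, w = spec.shape
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    r = (((yy - h // 2) ** 2 + (xx - w // 2) ** 2).float().sqrt()).round().long()
    profile = torch.zeros(int(r.max()) + 1)
    profile.scatter_add_(0, r.flatten(), spec.flatten())
    return profile / profile.sum()               # fraction of energy per radius
```

Running spectral_profile(pgd_perturbation(model, images, labels)) on a sequence of training checkpoints would show how the perturbations' energy shifts toward high spatial frequencies over training, which is the sort of learning-dynamics signal the abstract refers to.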


Related research

10/26/2021 · A Frequency Perspective of Adversarial Robustness
Adversarial examples pose a unique challenge for deep learning systems. ...

02/09/2021 · Adversarial Perturbations Are Not So Weird: Entanglement of Robust and Non-Robust Features in Neural Network Classifiers
Neural networks trained on visual data are well-known to be vulnerable t...

06/18/2021 · The Dimpled Manifold Model of Adversarial Examples in Machine Learning
The extreme fragility of deep neural networks when presented with tiny p...

01/02/2018 · High Dimensional Spaces, Deep Learning and Adversarial Examples
In this paper, we analyze deep learning from a mathematical point of vie...

02/28/2019 · On the Effectiveness of Low Frequency Perturbations
Carefully crafted, often imperceptible, adversarial perturbations have b...

12/02/2020 · From a Fourier-Domain Perspective on Adversarial Examples to a Wiener Filter Defense for Semantic Segmentation
Despite recent advancements, deep neural networks are not robust against...

02/09/2022 · Gradient Methods Provably Converge to Non-Robust Networks
Despite a great deal of research, it is still unclear why neural network...
