Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness

05/30/2023
by Suraj Srinivas, et al.

One of the remarkable properties of robust computer vision models is that their input gradients are often aligned with human perception, a phenomenon referred to in the literature as perceptually-aligned gradients (PAGs). Despite being trained only for classification, PAGs give robust models rudimentary generative capabilities, including image generation, denoising, and inpainting. However, the mechanisms underlying these phenomena remain unknown. In this work, we provide a first explanation of PAGs via off-manifold robustness: the property that a model is more robust to perturbations off the data manifold than to perturbations on it. We first show theoretically that off-manifold robustness causes input gradients to lie approximately on the data manifold, which explains their perceptual alignment. We then show that Bayes optimal models satisfy off-manifold robustness, and confirm the same empirically for robust models trained via gradient norm regularization, noise augmentation, and randomized smoothing. Quantifying the perceptual alignment of model gradients via their similarity with the gradients of generative models, we show that off-manifold robustness correlates well with perceptual alignment. Finally, based on the relative levels of on- and off-manifold robustness, we identify three regimes of robustness that affect both perceptual alignment and model accuracy: weak robustness, Bayes-aligned robustness, and excessive robustness.
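As a rough illustration of the quantities the abstract refers to, below is a minimal PyTorch sketch, not the authors' implementation, of (i) one training step with gradient norm regularization, one of the robust training schemes studied, and (ii) measuring perceptual alignment as the cosine similarity between a classifier's input gradient and a generative model's score. All names (model, score_model, lambda_reg) and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of two quantities from the abstract; names and
# hyperparameters are illustrative assumptions, not the authors' code.
import torch
import torch.nn.functional as F

def grad_norm_step(model, optimizer, x, y, lambda_reg=0.1):
    """One training step with gradient norm regularization:
    cross-entropy plus a penalty on the input-gradient norm."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    # create_graph=True so the gradient penalty can itself be
    # backpropagated through during loss.backward().
    (g,) = torch.autograd.grad(ce, x, create_graph=True)
    loss = ce + lambda_reg * g.flatten(1).norm(dim=1).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def perceptual_alignment(classifier, score_model, x, y):
    """Cosine similarity between the classifier's input gradient
    (w.r.t. the true-class logit) and a generative model's score
    (gradient of log-density) at x."""
    x = x.clone().requires_grad_(True)
    logit = classifier(x).gather(1, y[:, None]).squeeze(1)
    (g_cls,) = torch.autograd.grad(logit.sum(), x)
    with torch.no_grad():
        score = score_model(x)  # assumed to return grad_x log p(x)
    return F.cosine_similarity(g_cls.flatten(1), score.flatten(1), dim=1)
```

Under this reading, a model with perceptually-aligned gradients should score high on the second quantity, and the paper's claim is that the first kind of training induces exactly the off-manifold robustness that makes this happen.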


Related Research

06/15/2022
The Manifold Hypothesis for Gradient-Based Explanations
When do gradient-based explanation algorithms provide meaningful explana...

10/18/2019
Are Perceptually-Aligned Gradients a General Property of Robust Classifiers?
For a standard convolutional neural network, optimizing over the input p...

07/22/2022
Do Perceptually Aligned Gradients Imply Adversarial Robustness?
In the past decade, deep learning-based networks have achieved unprecede...

11/29/2021
Exploring Alignment of Representations with Human Perception
We argue that a valuable perspective on when a model learns good represe...

06/29/2023
CLIPAG: Towards Generator-Free Text-to-Image Generation
Perceptually Aligned Gradients (PAG) refer to an intriguing property obs...

04/03/2022
Adversarially robust segmentation models learn perceptually-aligned gradients
The effects of adversarial training on semantic segmentation networks ha...

06/16/2020
Gradient Alignment in Deep Neural Networks
One cornerstone of interpretable deep learning is the high degree of vis...
