Learning Perceptually-Aligned Representations via Adversarial Robustness

06/03/2019
by Logan Engstrom, et al.

Many applications of machine learning require models that are human-aligned, i.e., that make decisions based on human-meaningful information about the input. We identify the pervasive brittleness of deep networks' learned representations as a fundamental barrier to attaining this goal. We then re-cast robust optimization as a tool for enforcing human priors on the features learned by deep neural networks. The resulting robust feature representations turn out to be significantly more aligned with human perception. We leverage these representations to perform input interpolation, feature manipulation, and sensitivity mapping, without any post-processing or human intervention after model training. Our code and models for reproducing these results are available at https://git.io/robust-reps.
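The "robust optimization" referred to above is the standard min-max (adversarial training) objective, and the downstream applications follow from simple first-order operations on the resulting representation. Below is a minimal PyTorch sketch of both pieces: one robust training step with a PGD inner maximizer, and a feature-inversion routine of the kind that enables input interpolation and feature manipulation. This is an illustration, not the authors' implementation (see https://git.io/robust-reps for that); the model interface and all hyperparameters (eps, alpha, step counts, learning rate) are illustrative assumptions.

# Minimal sketch (not the authors' code): PGD adversarial training and
# first-order inversion of a learned representation. All hyperparameters
# are illustrative assumptions, not values from the paper.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    # Inner maximization: find a worst-case perturbation in the L-inf ball.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta = delta + alpha * grad.sign()   # ascend the loss
            delta = delta.clamp(-eps, eps)        # project into the eps-ball
            delta = (x + delta).clamp(0, 1) - x   # keep pixels in [0, 1]
    return (x + delta).detach()

def robust_training_step(model, optimizer, x, y):
    # Outer minimization: ordinary training loss on the adversarial inputs.
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()

def invert_features(feature_fn, target_repr, x_init, steps=200, lr=0.1):
    # Gradient descent in input space toward a target representation.
    # Setting target_repr to a convex combination of two images' features
    # yields input interpolation; nudging individual coordinates yields
    # feature manipulation.
    x = x_init.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (feature_fn(x) - target_repr).pow(2).sum().backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0, 1)
    return x.detach()

The inversion step relies on the robust model's input gradients being perceptually meaningful; run against a standard (non-robust) network, the same procedure tends to produce high-frequency noise, which is the representational brittleness the abstract identifies.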


Related research

07/08/2020 · Fast Training of Deep Neural Networks Robust to Adversarial Perturbations
Deep neural networks are capable of training fast and generalizing well ...

05/11/2021 · Leveraging Sparse Linear Layers for Debuggable Deep Networks
We show how fitting sparse linear models over learned deep feature repre...

04/03/2022 · Adversarially robust segmentation models learn perceptually-aligned gradients
The effects of adversarial training on semantic segmentation networks ha...

11/29/2021 · Exploring Alignment of Representations with Human Perception
We argue that a valuable perspective on when a model learns good represe...

08/26/2023 · Brain-like representational straightening of natural movies in robust feedforward neural networks
Representational straightening refers to a decrease in curvature of visu...

06/13/2021 · Inverting Adversarially Robust Networks for Image Synthesis
Recent research in adversarially robust classifiers suggests their repre...

06/15/2022 · Robust Attack Graph Generation
We present a method to learn automaton models that are more robust to in...
