There Is No Free Lunch In Adversarial Robustness (But There Are Unexpected Benefits)

05/30/2018
by Dimitris Tsipras, et al.

We provide a new understanding of the fundamental nature of adversarially robust classifiers and how they differ from standard models. In particular, we show that there provably exists a trade-off between the standard accuracy of a model and its robustness to adversarial perturbations. We demonstrate an intriguing phenomenon at the root of this tension: a certain dichotomy between "robust" and "non-robust" features. We show that while robustness comes at a price, it also has some surprising benefits. Robust models turn out to have interpretable gradients and feature representations that align unusually well with salient data characteristics. In fact, they yield striking feature interpolations that have thus far been possible to obtain only using generative models such as GANs.
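The trade-off between standard accuracy and robustness can be illustrated with a small simulation. The sketch below is only an assumption-laden toy, not a method taken from the abstract: the data distribution (one "robust" feature that matches the label with probability `p`, plus `d` weakly correlated Gaussian "non-robust" features), the parameter values, and the helper names `sample_point`, `sign`, and `evaluate` are all hypothetical choices made for illustration. It compares a classifier that averages the weak features with one that reads only the robust feature, under an l-infinity perturbation of size `eps`.

```python
import math
import random


def sample_point(d, p, eta, rng):
    """Draw one labeled point from the assumed toy distribution."""
    y = rng.choice([-1, 1])
    # "Robust" feature: equals the label with probability p, else flipped.
    x_robust = y if rng.random() < p else -y
    # "Non-robust" features: each is only weakly correlated with the label.
    x_weak = [rng.gauss(eta * y, 1.0) for _ in range(d)]
    return x_robust, x_weak, y


def sign(v):
    return 1 if v >= 0 else -1


def evaluate(n=2000, d=400, p=0.95, seed=0):
    """Return clean and adversarial accuracy for both toy classifiers."""
    rng = random.Random(seed)
    eta = 2.0 / math.sqrt(d)  # assumed strength of each weak feature
    eps = 2.0 * eta           # assumed l-infinity perturbation budget
    hits = {"std_clean": 0, "std_adv": 0, "rob_clean": 0, "rob_adv": 0}
    for _ in range(n):
        x_r, x_w, y = sample_point(d, p, eta, rng)
        mean_w = sum(x_w) / d
        # Classifier A: sign of the average of the weak features.
        hits["std_clean"] += sign(mean_w) == y
        # Worst case for A: shift every weak feature by eps against the
        # label, which moves the average by exactly eps.
        hits["std_adv"] += sign(mean_w - eps * y) == y
        # Classifier B: sign of the robust feature only.
        hits["rob_clean"] += sign(x_r) == y
        # Since eps < 1, the perturbation cannot flip the sign of x_r.
        hits["rob_adv"] += sign(x_r - eps * y) == y
    return {name: count / n for name, count in hits.items()}


results = evaluate()
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Under these assumptions, the averaging classifier is more accurate than `p` on clean data but collapses under the perturbation, while the robust-feature classifier stays near `p` in both settings. This mirrors, in miniature, the tension between "robust" and "non-robust" features described above.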

research
02/21/2020

Robustness from Simple Classifiers

Despite the vast success of Deep Neural Networks in numerous application...
research
05/19/2019

What Do Adversarially Robust Models Look At?

In this paper, we address the open question: "What do adversarially robu...
research
01/02/2019

Adversarial Robustness May Be at Odds With Simplicity

Current techniques in machine learning are so far unable to learn cl...
research
02/09/2021

Adversarial Perturbations Are Not So Weird: Entanglement of Robust and Non-Robust Features in Neural Network Classifiers

Neural networks trained on visual data are well-known to be vulnerable t...
research
07/22/2022

Do Perceptually Aligned Gradients Imply Adversarial Robustness?

In the past decade, deep learning-based networks have achieved unprecede...
research
10/22/2021

Adversarial robustness for latent models: Revisiting the robust-standard accuracies tradeoff

Over the past few years, several adversarial training methods have been ...
research
05/19/2021

Balancing Robustness and Sensitivity using Feature Contrastive Learning

It is generally believed that robust training of extremely large network...
