Explicit Tradeoffs between Adversarial and Natural Distributional Robustness

09/15/2022
by Mazda Moayeri, et al.

Several existing works study either the adversarial or the natural distributional robustness of deep neural networks, but separately. In practice, however, models need to exhibit both types of robustness to be reliable. In this work, we bridge this gap and show that explicit tradeoffs exist between adversarial and natural distributional robustness. We first consider a simple linear regression setting on Gaussian data with disjoint sets of core and spurious features. Through theoretical and empirical analysis in this setting, we show that (i) adversarial training with ℓ_1 and ℓ_2 norms increases the model's reliance on spurious features; (ii) for ℓ_∞ adversarial training, spurious reliance occurs only when the scale of the spurious features is larger than that of the core features; and (iii) adversarial training can have the unintended consequence of reducing distributional robustness, specifically when spurious correlations change in the test domain. Next, we present extensive empirical evidence, using a test suite of twenty adversarially trained models evaluated on five benchmark datasets (ObjectNet, RIVAL10, Salient ImageNet-1M, ImageNet-9, Waterbirds), that adversarially trained classifiers rely on backgrounds more than their standardly trained counterparts, validating our theoretical results. We also show that spurious correlations in training data, when preserved in the test domain, can improve adversarial robustness, revealing that previous claims that adversarial vulnerability is rooted in spurious correlations are incomplete.
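Result (i) can be illustrated with a small simulation. The snippet below is a hedged sketch, not the paper's exact construction: it trains a two-feature linear model on Gaussian data where the core feature determines the label exactly and the spurious feature is a noisy copy of the label, using the closed-form worst-case ℓ_2 loss for a linear model, (|w·x − y| + ε‖w‖_2)². The noise scale (0.3) and budget (ε = 0.5) are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
y = rng.normal(size=n)
x_core = y.copy()                       # core feature: exactly predictive
x_spur = y + 0.3 * rng.normal(size=n)   # spurious feature: noisy copy of the label
X = np.stack([x_core, x_spur], axis=1)

def train(eps, lr=0.05, steps=4000):
    """Gradient descent on the worst-case l2 loss (|w.x - y| + eps*||w||_2)^2."""
    w = np.zeros(2)
    for _ in range(steps):
        r = X @ w - y
        margin = np.abs(r) + eps * np.linalg.norm(w)
        grad_r = X * np.sign(r)[:, None]                   # subgradient of |r| wrt w
        grad_norm = eps * w / (np.linalg.norm(w) + 1e-12)  # subgradient of eps*||w||_2
        grad = (2 * margin[:, None] * (grad_r + grad_norm)).mean(axis=0)
        w -= lr * grad
    return w

w_std = train(eps=0.0)   # standard training
w_adv = train(eps=0.5)   # l2 adversarial training
print("standard   :", w_std)
print("adversarial:", w_adv)
```

Under this setup, standard training puts essentially all weight on the core feature (it attains zero residual), while adversarial training spreads weight onto the spurious feature because doing so shrinks ‖w‖_2 and hence the worst-case margin penalty, consistent with result (i).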

Related research

- 12/21/2019: Jacobian Adversarially Regularized Networks for Robustness
  Adversarial examples are crafted with imperceptible perturbations with t...
- 06/04/2019: Adversarial Training Generalizes Data-dependent Spectral Norm Regularization
  We establish a theoretical link between adversarial training and operato...
- 04/06/2022: Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
  Neural network classifiers can largely rely on simple spurious features,...
- 02/27/2022: A Unified Wasserstein Distributional Robustness Framework for Adversarial Training
  It is well-known that deep neural networks (DNNs) are susceptible to adv...
- 09/27/2022: Inducing Data Amplification Using Auxiliary Datasets in Adversarial Training
  Several recent studies have shown that the use of extra in-distribution ...
- 02/20/2020: Boosting Adversarial Training with Hypersphere Embedding
  Adversarial training (AT) is one of the most effective defenses to impro...
- 10/20/2022: Freeze then Train: Towards Provable Representation Learning under Spurious Correlations and Feature Noise
  The existence of spurious correlations such as image backgrounds in the ...
