Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training

06/02/2023
by   Binghui Li, et al.

Adversarial training is a standard method for training deep neural networks to be robust to adversarial perturbations. Similar to the surprising clean generalization ability observed in the standard deep learning setting, neural networks trained by adversarial training also generalize well on unseen clean data. In contrast to clean generalization, however, while adversarial training achieves low robust training error, a significant robust generalization gap remains, which prompts us to explore what mechanism leads to both clean generalization and robust overfitting (CGRO) during the learning process. In this paper, we provide a theoretical understanding of this CGRO phenomenon in adversarial training. First, we propose a theoretical framework of adversarial training in which we analyze the feature learning process to explain how adversarial training drives the network learner into the CGRO regime. Specifically, we prove that, under our patch-structured dataset, the CNN model provably partially learns the true feature but exactly memorizes the spurious features from training-time adversarial examples, which results in clean generalization together with robust overfitting. Under a more general data assumption, we then show the efficiency of the CGRO classifier from the perspective of representation complexity. On the empirical side, to verify our theoretical analysis on real-world vision datasets, we investigate the dynamics of the loss landscape during training. Moreover, inspired by our experiments, we prove a robust generalization bound based on the global flatness of the loss landscape, which may be of independent interest.
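The adversarial training procedure the abstract refers to can be sketched as a min-max loop: an inner attack maximizes the loss over a perturbation ball around each input, and an outer step minimizes the loss on the resulting adversarial examples. The toy example below (logistic regression with an l-infinity PGD attack, written in plain NumPy) is only an illustrative assumption of mine, not the paper's actual CNN model or patch-structured dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: labels in {-1, +1} from a ground-truth direction.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = np.sign(X @ w_true)

def grad_loss_w(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y * x.w)) w.r.t. w."""
    margins = y * (X @ w)
    s = -y / (1.0 + np.exp(margins))          # per-example dloss/dmargin * y
    return (X * s[:, None]).mean(axis=0)

def pgd_attack(w, X, y, eps=0.1, steps=5):
    """Inner maximization: l_inf PGD via sign-gradient ascent on the inputs."""
    delta = np.zeros_like(X)
    alpha = 2.0 * eps / steps                 # step size for the attack
    for _ in range(steps):
        margins = y * ((X + delta) @ w)
        # dloss/dx for the logistic loss of a linear model is -y * w / (1 + e^m).
        g_x = (-y / (1.0 + np.exp(margins)))[:, None] * w[None, :]
        delta = np.clip(delta + alpha * np.sign(g_x), -eps, eps)
    return X + delta

# Outer minimization: gradient descent on adversarial examples (min-max training).
w = np.zeros(10)
for _ in range(100):
    X_adv = pgd_attack(w, X, y)
    w -= 0.5 * grad_loss_w(w, X_adv, y)

clean_acc = (np.sign(X @ w) == y).mean()
```

Robust overfitting, in this vocabulary, is the gap between the low robust error measured on `X_adv` built from training data and the robust error on fresh test data; the sketch only shows the training loop itself.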


