Annealing Self-Distillation Rectification Improves Adversarial Training

05/20/2023
by Yu-Yu Wu, et al.

In standard adversarial training, models are optimized to fit one-hot labels within an allowable adversarial perturbation budget. However, ignoring the underlying distribution shift induced by these perturbations leads to robust overfitting. To address this issue and enhance adversarial robustness, we analyze the characteristics of robust models and find that they tend to produce smoother, better-calibrated outputs. Based on this observation, we propose a simple yet effective method, Annealing Self-Distillation Rectification (ADR), which generates soft labels that more accurately reflect the distribution shift under attack and thus provide better guidance during adversarial training. With ADR, we obtain rectified label distributions that significantly improve model robustness without requiring pre-trained models or extensive extra computation. Moreover, the method integrates with other adversarial training techniques in a plug-and-play fashion by replacing the hard labels in their objectives. We demonstrate the efficacy of ADR through extensive experiments and strong performance across datasets.
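The core idea is to replace the hard labels in the adversarial training objective with rectified soft targets produced by self-distillation. The sketch below illustrates one way such targets could be formed, assuming an EMA (weight-averaged) teacher, a linearly annealed softmax temperature, and an interpolation weight that gradually shifts trust toward the teacher; the schedules, constants, and helper names are illustrative placeholders rather than the paper's exact formulation.

```python
# Illustrative sketch only: schedules, constants, and names below are
# assumptions, not the exact ADR algorithm from the paper.
import torch
import torch.nn.functional as F

def annealed(start, end, step, total_steps):
    """Linearly move a scalar from `start` to `end` over training."""
    t = min(step / max(total_steps, 1), 1.0)
    return start + (end - start) * t

@torch.no_grad()
def rectified_targets(ema_teacher, x, y, num_classes, step, total_steps):
    """Blend a temperature-softened EMA-teacher prediction with the one-hot label."""
    temperature = annealed(5.0, 1.0, step, total_steps)  # soften less as training proceeds
    lam = annealed(0.5, 0.9, step, total_steps)          # rely on the teacher more over time
    teacher_probs = F.softmax(ema_teacher(x) / temperature, dim=1)
    one_hot = F.one_hot(y, num_classes).float()
    return lam * teacher_probs + (1.0 - lam) * one_hot

def soft_label_loss(student_logits_adv, soft_targets):
    """Cross-entropy of adversarial-example predictions against the soft targets."""
    log_probs = F.log_softmax(student_logits_adv, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

@torch.no_grad()
def ema_update(ema_teacher, student, decay=0.999):
    """Maintain the teacher as an exponential moving average of the student."""
    for p_t, p_s in zip(ema_teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)
```

In a training loop, `soft_label_loss(model(x_adv), rectified_targets(...))` would stand in for the hard-label cross-entropy term of a standard adversarial training objective, which is what makes the rectified targets a drop-in replacement for other adversarial training techniques.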

Related research

08/18/2021
Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better
Adversarial training is one effective approach for training robust deep ...

06/25/2023
Enhancing Adversarial Training via Reweighting Optimization Trajectory
Despite the fact that adversarial training has become the de facto metho...

01/25/2023
A Study on FGSM Adversarial Training for Neural Retrieval
Neural retrieval models have acquired significant effectiveness gains ov...

06/05/2022
Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training
Adversarial training has been widely explored for mitigating attacks aga...

08/19/2022
DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization
We consider the problem of OOD generalization, where the goal is to trai...

08/15/2023
SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification
Deep Learning methods have recently seen increased adoption in medical i...

09/18/2020
Prepare for the Worst: Generalizing across Domain Shifts with Adversarial Batch Normalization
Adversarial training is the industry standard for producing models that ...
