Learnable Boundary Guided Adversarial Training

11/23/2020, by Jiequan Cui et al.

Previous adversarial training methods raise model robustness at the cost of accuracy on natural data. In this paper, our goal is to reduce this natural-accuracy degradation. We use the logits from a clean model ℳ^natural to guide learning of the robust model ℳ^robust, motivated by the observation that logits from a well-trained clean model ℳ^natural embed the most discriminative features of natural data, e.g., a generalizable classifier boundary. Our solution is to constrain the logits of the robust model ℳ^robust, which takes adversarial examples as input, to be similar to those of the clean model ℳ^natural fed with the corresponding natural data. This lets ℳ^robust inherit the classifier boundary of ℳ^natural; we thus name our method Boundary Guided Adversarial Training (BGAT). Moreover, we generalize BGAT to Learnable Boundary Guided Adversarial Training (LBGAT) by training ℳ^natural and ℳ^robust simultaneously and collaboratively, so that they learn the most robustness-friendly classifier boundary and achieve the strongest robustness. Extensive experiments are conducted on CIFAR-10, CIFAR-100, and the challenging Tiny ImageNet datasets. Combined with other state-of-the-art adversarial training approaches, e.g., Adversarial Logit Pairing (ALP) and TRADES, performance is further enhanced.
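The core idea of the abstract can be sketched as a per-example loss: a logit-matching term that pulls the robust model's logits on an adversarial input toward the clean model's logits on the corresponding natural input, plus a classification term that keeps the clean model accurate. A minimal NumPy sketch follows; the function name `lbgat_loss`, the use of MSE for logit matching, and the trade-off weight `gamma` are illustrative assumptions, not details taken from the abstract, and the joint training loop of the two models is omitted.

```python
import numpy as np

def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for a single example.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def lbgat_loss(clean_logits, robust_logits, label, gamma=1.0):
    """Sketch of a boundary-guided loss for one example.

    clean_logits  -- logits of M_natural on the natural input x
    robust_logits -- logits of M_robust on the adversarial input x_adv
    gamma         -- assumed trade-off weight (hypothetical)

    The guidance term pulls the robust model's logits toward the clean
    model's, so M_robust can inherit M_natural's classifier boundary;
    the cross-entropy term keeps M_natural accurate on natural data.
    """
    guidance = np.mean((robust_logits - clean_logits) ** 2)  # logit matching (MSE)
    natural_ce = cross_entropy(clean_logits, label)          # clean classification loss
    return guidance + gamma * natural_ce
```

When the two logit vectors coincide, the guidance term vanishes and only the clean classification loss remains; any gap between them adds a quadratic penalty, which is what "guides" the robust model's boundary.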


