Improving the Generalization of Adversarial Training with Domain Adaptation

10/01/2018
by Chuanbiao Song, et al.

By injecting adversarial examples into the training data, adversarial training is a promising method for improving the robustness of deep learning models. However, most existing adversarial training approaches are based on a specific type of adversarial attack, which may not provide sufficiently representative samples from the adversarial domain and thus leads to weak generalization to adversarial examples from other attacks. To scale to large datasets, the perturbations used to generate adversarial examples are usually crafted with fast single-step attacks. This work focuses on adversarial training with the single-step yet efficient FGSM adversary. In this scenario, it is difficult to train a model that generalizes well because representative adversarial samples are lacking, i.e., the available samples do not accurately reflect the adversarial domain. To address this problem, we propose a novel Adversarial Training with Domain Adaptation (ATDA) method that regards adversarial training with the FGSM adversary as a domain adaptation task with a limited number of target-domain samples. The main idea is to learn a representation that is semantically meaningful and domain invariant across the clean domain and the adversarial domain. Empirical evaluations demonstrate that ATDA greatly improves the generalization of adversarial training and achieves state-of-the-art results on standard benchmark datasets.
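To make the core idea concrete, below is a minimal PyTorch sketch of adversarial training with an FGSM adversary plus a domain-adaptation penalty that aligns clean and adversarial feature distributions. It is an illustration of the general approach, not the authors' exact formulation: the toy network, the choice of mean-plus-covariance (CORAL-style) alignment, and the hyperparameters `eps` and `lam` are all assumptions introduced for this example.

```python
# Hedged sketch: FGSM adversarial training with a feature-alignment term.
# The architecture, alignment losses, and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """Toy classifier that exposes its penultimate features."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, 128), nn.ReLU(),
        )
        self.head = nn.Linear(128, num_classes)

    def forward(self, x):
        f = self.backbone(x)      # feature representation
        return f, self.head(f)    # (features, logits)

def fgsm(model, x, y, eps=8 / 255):
    """Single-step FGSM: move x along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x)[1], y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def coral(f_src, f_tgt):
    """CORAL-style covariance alignment of two feature batches."""
    def cov(f):
        f = f - f.mean(dim=0, keepdim=True)
        return f.t() @ f / (f.size(0) - 1)
    d = f_src.size(1)
    return (cov(f_src) - cov(f_tgt)).pow(2).sum() / (4 * d * d)

def train_step(model, opt, x, y, eps=8 / 255, lam=1.0):
    x_adv = fgsm(model, x, y, eps)   # adversarial (target) domain samples
    f_c, logits_c = model(x)         # clean (source) domain
    f_a, logits_a = model(x_adv)
    task = F.cross_entropy(logits_c, y) + F.cross_entropy(logits_a, y)
    # Penalize the gap between the first and second moments of clean vs.
    # adversarial features, pushing the representation toward domain invariance.
    align = (f_c.mean(0) - f_a.mean(0)).pow(2).sum() + coral(f_c, f_a)
    loss = task + lam * align
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    model = Net()
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))
    print(train_step(model, opt, x, y))
```

The single-step FGSM adversary keeps the overhead to roughly one extra forward/backward pass per batch, which is what makes this setup scale to large datasets; the alignment term is what compensates for FGSM's unrepresentative adversarial samples by explicitly shrinking the clean-versus-adversarial feature gap.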
