Pre-trained transformer for adversarial purification

05/27/2023
by Kai Wu, et al.

As deep neural networks are deployed in ever more daily services, their reliability becomes essential. Worryingly, deep neural networks are vulnerable and sensitive to adversarial attacks, of which evasion-based attacks are the most common against deployed services. Recent works typically strengthen robustness through adversarial training or by leveraging knowledge drawn from large amounts of clean data. In practice, however, retraining and redeploying the model demands a large computational budget, causing heavy losses for the online service. Moreover, once adversarial examples of a particular attack are detected, the service provider has access to only a limited number of them, while abundant clean data may be unavailable. Given these problems, we propose a new scenario, RaPiD (Rapid Plug-in Defender): rapidly defending a frozen original service model against a specific attack using only a few clean and adversarial examples. Motivated by the generalization and universal-computation abilities of pre-trained transformer models, we propose a new defense method, CeTaD, which stands for Considering Pre-trained Transformers as Defenders. In particular, we evaluate the effectiveness and transferability of CeTaD in the one-shot adversarial-example setting and explore the impact of the different parts of CeTaD as well as of the training-data conditions. CeTaD is flexible, can be embedded into an arbitrary differentiable model, and is suitable for various types of attacks.
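The plug-in idea above can be illustrated with a minimal sketch: a small transformer encoder is inserted in front of a frozen service model and applies a residual "purification" step to the input, with only the defender's parameters trainable. All names and architecture details here (`PlugInDefender`, layer counts, dimensions) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PlugInDefender(nn.Module):
    """Hypothetical sketch of a plug-in defender: a small transformer
    encoder purifies the input via a residual correction before it
    reaches the frozen, already-deployed service model."""
    def __init__(self, service_model: nn.Module, dim: int):
        super().__init__()
        self.service_model = service_model
        for p in self.service_model.parameters():
            p.requires_grad_(False)  # the deployed model stays frozen
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.purifier = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):              # x: (batch, seq_len, dim)
        x = x + self.purifier(x)       # residual purification step
        return self.service_model(x)

# A toy frozen "service model"; only the defender is optimized,
# e.g. on the few available clean and adversarial examples.
frozen = nn.Sequential(nn.Flatten(), nn.Linear(8 * 16, 10))
model = PlugInDefender(frozen, dim=16)
opt = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
logits = model(torch.randn(2, 8, 16))  # shape: (2, 10)
```

Because the service model's parameters are frozen, training touches only the purifier, which matches the RaPiD constraint of not retraining or redeploying the original service.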

Related research

- Class-Aware Domain Adaptation for Improving Adversarial Robustness (05/10/2020)
- Adversarial Imitation Attack (03/28/2020)
- Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN (02/28/2019)
- Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors (02/28/2019)
- Introducing Competition to Boost the Transferability of Targeted Adversarial Examples through Clean Feature Mixup (05/24/2023)
- LiBRe: A Practical Bayesian Approach to Adversarial Detection (03/27/2021)
- Utilizing Network Properties to Detect Erroneous Inputs (02/28/2020)
