Common Knowledge Learning for Generating Transferable Adversarial Examples

07/01/2023
by Ruijie Yang, et al.

This paper focuses on an important type of black-box attack, i.e., the transfer-based adversarial attack, where the adversary generates adversarial examples with a substitute (source) model and uses them to attack an unseen target model without knowing its internals. Existing methods tend to yield unsatisfactory adversarial transferability when the source and target models come from different types of DNN architectures (e.g., ResNet-18 and Swin Transformer). In this paper, we observe that this phenomenon is induced by an output inconsistency problem. To alleviate this problem while effectively utilizing existing DNN models, we propose a common knowledge learning (CKL) framework that learns better network weights for generating adversarial examples with higher transferability, under fixed network architectures. Specifically, to reduce model-specific features and obtain better output distributions, we construct a multi-teacher framework in which knowledge is distilled from different teacher architectures into a single student network. Considering that the gradient with respect to the input is usually used to generate adversarial examples, we further impose constraints on the input gradients between the student and teacher models, which alleviates the output inconsistency problem and enhances adversarial transferability. Extensive experiments demonstrate that our proposed method significantly improves adversarial transferability.
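The abstract describes two training signals: distilling output distributions from multiple teacher architectures into one student, and aligning the student's input gradients with the teachers'. The PyTorch sketch below illustrates how such a combined objective could look; the model choices, temperature, loss weights, and helper functions are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F
import torchvision.models as tvm

def input_gradient(model, x, y, create_graph=False):
    """Gradient of the classification loss w.r.t. the input image."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=create_graph)
    return grad

def ckl_loss(student, teachers, x, y, T=4.0, lam_kd=1.0, lam_grad=10.0):
    """Soft-label distillation from multiple teachers plus input-gradient alignment.
    Weights T, lam_kd, lam_grad are placeholder hyper-parameters."""
    s_logits = student(x)
    # create_graph=True so the gradient-matching term can update the student's weights
    s_grad = input_gradient(student, x, y, create_graph=True)
    loss_kd, loss_grad = 0.0, 0.0
    for teacher in teachers:
        with torch.no_grad():
            t_logits = teacher(x)
        # match the student's output distribution to each teacher architecture
        loss_kd = loss_kd + F.kl_div(
            F.log_softmax(s_logits / T, dim=1),
            F.softmax(t_logits / T, dim=1),
            reduction="batchmean") * (T * T)
        # align the input gradients, which drive adversarial-example generation
        loss_grad = loss_grad + F.mse_loss(s_grad, input_gradient(teacher, x, y))
    n = len(teachers)
    return (F.cross_entropy(s_logits, y)
            + lam_kd * loss_kd / n
            + lam_grad * loss_grad / n)

# Illustrative setup: one CNN teacher and one transformer teacher, ResNet-18 student.
teachers = [tvm.resnet50(weights="IMAGENET1K_V1").eval(),
            tvm.swin_t(weights="IMAGENET1K_V1").eval()]
student = tvm.resnet18(num_classes=1000)
```

Once trained this way, the student serves as the substitute (source) model: a standard gradient-based transfer attack run on the student produces the adversarial examples that are then evaluated against unseen target models.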


