M-to-N Backdoor Paradigm: A Stealthy and Fuzzy Attack to Deep Learning Models

11/03/2022
by Linshan Hou, et al.

Recent studies show that deep neural networks (DNNs) are vulnerable to backdoor attacks. A backdoored DNN model behaves normally on clean inputs, but outputs the attacker's expected behavior when an input contains a pre-defined pattern called a trigger. In some tasks, however, the attacker cannot know the exact target class that exhibits his/her expected behavior, because the task may contain a large number of classes and the attacker does not have full access to their semantic details. The attacker is therefore willing to attack multiple suspected targets to achieve his/her purpose. In light of this, we propose in this paper the M-to-N backdoor attack, a new attack paradigm that allows an attacker to launch a fuzzy attack by attacking N suspected targets simultaneously, where each of the N targets can be activated by any one of its M triggers. To achieve better stealthiness, we randomly select M clean images from the training dataset as the triggers for each target. Since these triggers follow the same distribution as the clean images, inputs poisoned by them are difficult for input-based defenses to detect, and backdoored models trained on the poisoned training dataset are likewise difficult for model-based defenses to detect. Compared with prior backdoor attacks, our attack is thus stealthier and has a higher probability of achieving the attack purpose by attacking multiple suspected targets simultaneously. Extensive experiments show that our attack is effective on different datasets with various models, achieving high attack success rates (e.g., 99.43% and 98.23% against different numbers of targets) while poisoning only an extremely small portion of the training dataset (e.g., less than 2%), and that it remains robust against state-of-the-art defenses.
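The poisoning mechanism described above (sampling M clean training images per target as triggers, then relabeling a small poisoned fraction of the training set to the N targets) can be illustrated with a minimal sketch. The snippet below is only an assumption-laden illustration, not the paper's implementation: the function name build_poisoned_dataset, the alpha-blending trigger injection, and all parameter defaults are hypothetical choices made for clarity.

# Minimal sketch of an M-to-N style poisoning step, assuming images are
# NumPy arrays of identical shape. The alpha blend used to inject a trigger
# is an assumption; the paper's exact injection function may differ.
import random
import numpy as np

def build_poisoned_dataset(images, labels, target_classes, m_triggers=3,
                           poison_rate=0.02, alpha=0.2, seed=0):
    """Poison a small fraction of (images, labels).

    For each of the N target classes, M clean images are drawn from the
    training set to serve as triggers. A poisoned sample is a clean image
    blended with one of its target's triggers and relabeled to that target.
    """
    rng = random.Random(seed)
    n = len(images)
    # Pick M clean training images per target to serve as its triggers.
    triggers = {t: [images[i] for i in rng.sample(range(n), m_triggers)]
                for t in target_classes}

    poison_idx = rng.sample(range(n), int(poison_rate * n))
    poisoned_images = list(images)
    poisoned_labels = list(labels)
    for i in poison_idx:
        target = rng.choice(target_classes)              # one of the N targets
        trig = rng.choice(triggers[target])              # one of its M triggers
        blended = (1 - alpha) * images[i] + alpha * trig  # assumed injection
        poisoned_images[i] = blended.astype(images[i].dtype)
        poisoned_labels[i] = target
    return np.stack(poisoned_images), np.array(poisoned_labels), triggers

Training a standard classifier on the returned dataset would, under these assumptions, associate each target class with its M trigger images while leaving clean-input behavior largely unchanged.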


