NCL: Textual Backdoor Defense Using Noise-augmented Contrastive Learning

03/03/2023
by   Shengfang Zhai, et al.

Backdoor attacks have attracted growing attention because of the serious harm they cause to deep learning models. The adversary poisons the training data so that a victim who unknowingly trains on the poisoned dataset ends up with a backdoored model. In the text domain, however, existing work does not provide sufficient defense against backdoor attacks. In this paper, we propose a Noise-augmented Contrastive Learning (NCL) framework to defend against textual backdoor attacks when training models on untrustworthy data. To mitigate the mapping between triggers and the target label, we add appropriate noise to perturb possible backdoor triggers, augment the training dataset with these noised samples, and then pull homologous samples together in the feature space using a contrastive learning objective. Experiments demonstrate the effectiveness of our method in defending against three types of textual backdoor attacks, outperforming prior work.
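To make the recipe concrete, below is a minimal PyTorch sketch of the idea: build a noise-perturbed view of each input (so that any inserted trigger is likely to be disturbed) and add a contrastive term that pulls each sample toward its own augmented view while pushing it away from other samples in the batch. The names `encoder`, `classifier`, and `random_token_noise`, the NT-Xent form of the loss, and the weighting `lam` are all illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of noise-augmented contrastive training for text.
# Assumptions: `encoder` maps token ids (B, T) to sentence embeddings (B, D),
# and `classifier` maps embeddings to class logits.
import torch
import torch.nn.functional as F

def random_token_noise(input_ids, mask_token_id, noise_prob=0.15):
    """Randomly mask tokens, so an inserted backdoor trigger is likely
    perturbed in the augmented view (illustrative augmentation)."""
    noise = torch.rand_like(input_ids, dtype=torch.float) < noise_prob
    return torch.where(noise, torch.full_like(input_ids, mask_token_id), input_ids)

def ntxent_loss(z1, z2, temperature=0.1):
    """NT-Xent: pull each sample toward its homologous (augmented) view,
    push it away from the other samples in the batch."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature              # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

def training_step(encoder, classifier, input_ids, labels, mask_token_id, lam=1.0):
    """One step combining the usual classification loss with the
    contrastive term over noise-augmented views (illustrative weighting)."""
    noisy_ids = random_token_noise(input_ids, mask_token_id)
    h_clean = encoder(input_ids)                    # (B, D) embeddings
    h_noisy = encoder(noisy_ids)
    ce = F.cross_entropy(classifier(h_clean), labels)
    cl = ntxent_loss(h_clean, h_noisy)
    return ce + lam * cl
```

In practice one might use, say, a BERT encoder's [CLS] embedding for `encoder`; the paper's actual perturbation strategy and loss weighting may differ from this sketch.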
