ONION: A Simple and Effective Defense Against Textual Backdoor Attacks

11/20/2020
by   Fanchao Qi, et al.

Backdoor attacks are an emergent training-time threat to deep neural networks (DNNs). They can manipulate the output of DNNs and are highly insidious. In the field of natural language processing, several attack methods have been proposed that achieve very high attack success rates on multiple popular models. Nevertheless, textual backdoor defense has received little study. In this paper, we propose a simple and effective textual backdoor defense named ONION, which is based on outlier word detection and might be the first method that can handle all attack situations. Experiments demonstrate the effectiveness of our model in blocking two recent backdoor attack methods.
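The abstract says only that the defense rests on outlier word detection, so the following is a minimal sketch of one way such detection could work, not the authors' implementation. It assumes a leave-one-out score: a word is suspicious if removing it makes the sentence noticeably less surprising to a language model. A toy add-one-smoothed unigram model stands in for a real language model, and the names `make_unigram_perplexity`, `suspicion_scores`, `filter_outliers`, and the `threshold` parameter are all hypothetical.

```python
import math
from collections import Counter


def make_unigram_perplexity(corpus_tokens):
    """Build a toy perplexity function from a reference corpus.

    Assumption: an add-one-smoothed unigram model replaces the
    real language model a practical defense would use.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen words

    def perplexity(tokens):
        if not tokens:
            return 1.0
        log_prob = sum(
            math.log((counts[t] + 1) / (total + vocab)) for t in tokens
        )
        return math.exp(-log_prob / len(tokens))

    return perplexity


def suspicion_scores(tokens, perplexity):
    """Score each word by how much perplexity drops when it is removed."""
    base = perplexity(tokens)
    return [
        base - perplexity(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def filter_outliers(tokens, perplexity, threshold):
    """Keep only words whose suspicion score does not exceed the threshold."""
    scores = suspicion_scores(tokens, perplexity)
    return [t for t, s in zip(tokens, scores) if s <= threshold]


if __name__ == "__main__":
    # Hypothetical reference text and a sentence with an inserted rare token.
    corpus = (
        "the movie was great the movie was bad "
        "the film was great the plot was good"
    ).split()
    ppl = make_unigram_perplexity(corpus)
    sentence = "the movie was cf great".split()
    print(filter_outliers(sentence, ppl, threshold=1.0))
```

In this sketch the threshold trades off removing inserted trigger words against deleting legitimate rare words; it would need tuning on clean held-out text.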


research
09/02/2022

Adversarial Color Film: Effective Physical-World Attack to DNNs

It is well known that the performance of deep neural networks (DNNs) is ...
research
09/19/2020

Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations

Adversarial attacking aims to fool deep neural networks with adversarial...
research
05/25/2022

Textual Backdoor Attacks with Iterative Trigger Injection

The backdoor attack has become an emerging threat for Natural Language P...
research
04/13/2021

Fall of Giants: How popular text-based MLaaS fall against a simple evasion attack

The increased demand for machine learning applications made companies of...
research
01/25/2023

BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing

Deep neural networks (DNNs) and natural language processing (NLP) system...
research
03/03/2023

NCL: Textual Backdoor Defense Using Noise-augmented Contrastive Learning

At present, backdoor attacks attract attention as they do great harm to ...
research
10/10/2019

Defending Neural Backdoors via Generative Distribution Modeling

Neural backdoor attack is emerging as a severe security threat to deep l...
