Towards Variable-Length Textual Adversarial Attacks

04/16/2021
by Junliang Guo, et al.

Adversarial attacks have exposed the vulnerability of machine learning models; however, conducting textual adversarial attacks on natural language processing tasks is non-trivial due to the discreteness of the data. Most previous approaches attack with the atomic replacement operation alone, which usually yields fixed-length adversarial examples and therefore limits exploration of the decision space. In this paper, we propose variable-length textual adversarial attacks (VL-Attack), which integrate three atomic operations, namely insertion, deletion, and replacement, into a unified framework by introducing and manipulating a special blank token during the attack. In this way, our approach can search more comprehensively for adversarial examples around the decision boundary and conduct attacks more effectively. Specifically, our method drops the accuracy of IMDB classification by 96% while editing only 1.3% of the tokens when attacking a pre-trained BERT model. In addition, fine-tuning the victim model on the generated adversarial samples improves its robustness without hurting performance, especially for length-sensitive models. On non-autoregressive machine translation, our method achieves a BLEU score of 33.18 on IWSLT14 German-English translation, an improvement of 1.47 over the baseline model.
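
To make the unified framework concrete, below is a minimal Python sketch of how insertion, deletion, and replacement can all be expressed through a single blank-token mechanism. This is an illustration under assumptions, not the paper's implementation: the token string "[BLANK]", the function apply_edit, and the fill callback (standing in for a masked-language-model prediction that proposes a token for a blank position) are hypothetical names introduced here.

from typing import Callable, List

BLANK = "[BLANK]"  # hypothetical name for the special blank token

def apply_edit(tokens: List[str], pos: int, op: str,
               fill: Callable[[List[str], int], str]) -> List[str]:
    # Apply one atomic edit at position `pos`. `fill` proposes a token for
    # a blank position; a masked language model would stand in here.
    out = list(tokens)
    if op == "insert":
        out.insert(pos, BLANK)      # lengthen the sequence by one blank...
        out[pos] = fill(out, pos)   # ...then fill the blank with a proposal
    elif op == "delete":
        del out[pos]                # shorten the sequence by one token
    elif op == "replace":
        out[pos] = BLANK            # blank out the original token...
        out[pos] = fill(out, pos)   # ...and fill it with a substitute
    else:
        raise ValueError(f"unknown op: {op}")
    return out

# Toy usage with a trivial fill function that always proposes one word:
tokens = "the movie was great".split()
print(apply_edit(tokens, 2, "replace", lambda t, i: "seemed"))
# -> ['the', 'movie', 'seemed', 'great']
print(apply_edit(tokens, 2, "insert", lambda t, i: "really"))
# -> ['the', 'movie', 'really', 'was', 'great']

Phrasing insertion and replacement as "place a blank, then fill it" is what lets a single fill model drive edits that change the sequence length, which is the property the abstract highlights.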

Related research

05/23/2022
Learning to Ignore Adversarial Attacks
Despite the strong performance of current NLP models, they can be brittl...

09/05/2022
Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples
Recent advances in the development of large language models have resulte...

02/06/2023
Less is More: Understanding Word-level Textual Adversarial Attack via n-gram Frequency Descend
Word-level textual adversarial attacks have achieved striking performanc...

03/01/2021
Token-Modification Adversarial Attacks for Natural Language Processing: A Survey
There are now many adversarial attacks for natural language processing s...

06/19/2020
Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers
An adversarial attack paradigm explores various scenarios for the vulner...

10/22/2022
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
Adversarial Examples Detection (AED) is a crucial defense technique agai...

05/24/2022
Defending a Music Recommender Against Hubness-Based Adversarial Attacks
Adversarial attacks can drastically degrade performance of recommenders ...
