BERT-ATTACK: Adversarial Attack Against BERT Using BERT

04/21/2020
by Linyang Li, et al.

Adversarial attacks for discrete data (such as text) have proved significantly more challenging than for continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods. Currently, successful attack methods for text usually adopt heuristic replacement strategies at the character or word level, and it remains challenging to find the optimal solution in the massive space of possible replacement combinations while preserving semantic consistency and language fluency. In this paper, we propose BERT-Attack, a high-quality and effective method to generate adversarial samples using pre-trained masked language models exemplified by BERT. We turn BERT against its fine-tuned models and other deep neural models for downstream tasks. Our method successfully misleads the target models into predicting incorrectly, outperforming state-of-the-art attack strategies in both success rate and perturbation percentage, while the generated adversarial samples remain fluent and semantically preserved. The computational cost is also low, which makes large-scale generation feasible.
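
To make the idea of word-level replacement against a target classifier concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify the algorithm's details. The sketch assumes a greedy loop that ranks words by how much masking them lowers the true-label probability, then tries substitutes for the most important words until the prediction flips. The names `greedy_substitution_attack`, `toy_predict_proba`, `toy_propose_candidates`, `WEIGHTS`, and `SUBSTITUTES` are all hypothetical, and a hand-written lookup table stands in for the candidates that a masked language model such as BERT would supply.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy target model: a bag-of-words sentiment scorer.
WEIGHTS = {"great": 2.0, "good": 1.0, "decent": 0.5, "slow": -0.8}

def toy_predict_proba(tokens: Sequence[str]) -> List[float]:
    """Return [P(negative), P(positive)] from summed word weights."""
    score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    p_pos = 1.0 / (1.0 + math.exp(-score))
    return [1.0 - p_pos, p_pos]

# Hypothetical stand-in for masked-LM suggestions (assumption: a real
# attack would query a masked language model for in-context candidates).
SUBSTITUTES = {"good": ["decent"], "movie": ["film"]}

def toy_propose_candidates(tokens: Sequence[str], position: int) -> List[str]:
    return SUBSTITUTES.get(tokens[position], [])

def greedy_substitution_attack(
    tokens: Sequence[str],
    true_label: int,
    predict_proba: Callable[[Sequence[str]], List[float]],
    propose_candidates: Callable[[Sequence[str], int], List[str]],
    max_perturb_ratio: float = 0.4,
) -> Tuple[List[str], bool]:
    """Greedy word substitution; returns (tokens, attack_succeeded)."""
    current = list(tokens)
    base = predict_proba(current)[true_label]

    # Rank positions by the drop in true-label probability when masked.
    importance = []
    for i in range(len(current)):
        masked = current[:i] + ["[MASK]"] + current[i + 1:]
        importance.append((base - predict_proba(masked)[true_label], i))
    importance.sort(reverse=True)

    budget = max(1, int(max_perturb_ratio * len(current)))
    changed = 0
    for _, i in importance:
        if changed >= budget:
            break
        best_word, best_prob = None, predict_proba(current)[true_label]
        for cand in propose_candidates(current, i):
            if cand == current[i]:
                continue
            trial = current[:i] + [cand] + current[i + 1:]
            probs = predict_proba(trial)
            if probs.index(max(probs)) != true_label:
                return trial, True  # prediction flipped: stop early
            if probs[true_label] < best_prob:
                best_word, best_prob = cand, probs[true_label]
        if best_word is not None:
            current[i] = best_word  # keep the most damaging substitute
            changed += 1
    return current, False

sentence = ["a", "good", "but", "slow", "movie"]
adversarial, success = greedy_substitution_attack(
    sentence, 1, toy_predict_proba, toy_propose_candidates
)
```

In this toy setup a single substitution is enough to change the predicted label, which mirrors the abstract's two metrics: success rate (did the prediction flip) and perturbation percentage (how many words had to change). Swapping the lookup table for real masked-LM predictions and the toy scorer for a fine-tuned classifier would be the natural next step, but those details are outside what the abstract states.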

research
03/18/2021

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

Natural language processing (NLP) tasks, ranging from text classificatio...
research
06/09/2023

COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models

Prompt-based learning has been proved to be an effective way in pre-trai...
research
05/27/2019

Combating Adversarial Misspellings with Robust Word Recognition

To combat adversarial spelling mistakes, we propose placing a word recog...
research
07/13/2021

Using BERT Encoding to Tackle the Mad-lib Attack in SMS Spam Detection

One of the stratagems used to deceive spam filters is to substitute voca...
research
05/23/2021

Killing Two Birds with One Stone: Stealing Model and Inferring Attribute from BERT-based APIs

The advances in pre-trained models (e.g., BERT, XLNET and etc) have larg...
research
08/10/2020

FireBERT: Hardening BERT-based classifiers against adversarial attack

We present FireBERT, a set of three proof-of-concept NLP classifiers har...
research
10/28/2021

Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework

Despite great success on many machine learning tasks, deep neural networ...
