BERT-ATTACK: Adversarial Attack Against BERT Using BERT

04/21/2020
by Linyang Li, et al.
Fudan University

Adversarial attacks for discrete data (such as text) have proved significantly more challenging than for continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods. Currently, successful attack methods for text usually adopt heuristic replacement strategies at the character or word level, but it remains challenging to find the optimal solution in the massive space of possible combinations of replacements while preserving semantic consistency and language fluency. In this paper, we propose BERT-Attack, a high-quality and effective method to generate adversarial samples using pre-trained masked language models exemplified by BERT. We turn BERT against its fine-tuned models and other deep neural models for downstream tasks. Our method successfully misleads the target models into predicting incorrectly, outperforming state-of-the-art attack strategies in both success rate and perturbation percentage, while the generated adversarial samples remain fluent and semantically preserved. In addition, the computational cost is low, which makes large-scale generation feasible.
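
To make the idea of word-level replacement concrete, the sketch below shows a generic greedy substitution attack in Python. It is an illustration under stated assumptions, not the authors' implementation: `toy_positive_score` is a hypothetical stand-in for a fine-tuned target classifier, `TOY_CANDIDATES` is a hypothetical lookup table standing in for the substitutes a masked language model such as BERT would propose, and `greedy_attack` and `max_ratio` (a cap on the perturbation percentage) are names invented for this example.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical stand-in for a masked language model: a fixed table of
# substitutes per word. A real attack would instead ask BERT for its
# top-k predictions at the position being replaced.
TOY_CANDIDATES: Dict[str, List[str]] = {
    "great": ["fine", "decent"],
    "loved": ["liked", "watched"],
    "movie": ["film"],
}

# Hypothetical stand-in for the target model: per-word sentiment weights.
TOY_WEIGHTS: Dict[str, float] = {
    "great": 2.0, "loved": 1.5, "fine": 0.5, "liked": 0.4, "decent": 0.3,
}


def toy_positive_score(tokens: Sequence[str]) -> float:
    """Toy target model: a score above the threshold means 'positive'."""
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)


def greedy_attack(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    candidates: Dict[str, List[str]],
    threshold: float = 1.0,
    max_ratio: float = 0.5,
) -> Optional[List[str]]:
    """Try to push score_fn to or below threshold by replacing few words.

    Returns the perturbed token list, or None if the attack fails within
    the allowed perturbation budget.
    """
    current = list(tokens)
    base = score_fn(current)
    if base <= threshold:
        return None  # the input is not classified as positive to begin with

    # Rank positions by how much the score drops when the word is masked.
    ranked = []
    for i in range(len(current)):
        masked = current[:i] + ["[MASK]"] + current[i + 1:]
        ranked.append((base - score_fn(masked), i))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))

    budget = max(1, int(max_ratio * len(current)))
    changed = 0
    for _, i in ranked:
        if changed >= budget:
            break
        options = candidates.get(current[i], [])
        if not options:
            continue
        # Pick the substitute that lowers the score the most.
        best = min(
            options,
            key=lambda w: score_fn(current[:i] + [w] + current[i + 1:]),
        )
        trial = current[:i] + [best] + current[i + 1:]
        if score_fn(trial) < score_fn(current):
            current = trial
            changed += 1
            if score_fn(current) <= threshold:
                return current  # prediction flipped
    return None


if __name__ == "__main__":
    sentence = ["i", "loved", "this", "great", "movie"]
    print(greedy_attack(sentence, toy_positive_score, TOY_CANDIDATES))
```

The toy candidate table ignores context, so nothing here enforces fluency or semantic preservation. In the setting the abstract describes, candidates come from a pre-trained masked language model, which is what is meant to keep the substitutions fluent and close in meaning.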

03/18/2021

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

Natural language processing (NLP) tasks, ranging from text classificatio...
06/09/2023

COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models

Prompt-based learning has been proved to be an effective way in pre-trai...
05/27/2019

Combating Adversarial Misspellings with Robust Word Recognition

To combat adversarial spelling mistakes, we propose placing a word recog...
07/13/2021

Using BERT Encoding to Tackle the Mad-lib Attack in SMS Spam Detection

One of the stratagems used to deceive spam filters is to substitute voca...
05/23/2021

Killing Two Birds with One Stone: Stealing Model and Inferring Attribute from BERT-based APIs

The advances in pre-trained models (e.g., BERT, XLNET and etc) have larg...
08/10/2020

FireBERT: Hardening BERT-based classifiers against adversarial attack

We present FireBERT, a set of three proof-of-concept NLP classifiers har...
06/02/2021

BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks

Adversarial attacks expose important blind spots of deep learning system...