Gradient-based Adversarial Attacks against Text Transformers

04/15/2021
by Chuan Guo et al.

We propose the first general-purpose gradient-based attack against transformer models. Instead of searching for a single adversarial example, we search for a distribution of adversarial examples parameterized by a continuous-valued matrix, hence enabling gradient-based optimization. We empirically demonstrate that our white-box attack attains state-of-the-art attack performance on a variety of natural language tasks. Furthermore, we show that a powerful black-box transfer attack, enabled by sampling from the adversarial distribution, matches or exceeds existing methods, while only requiring hard-label outputs.
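The core idea of optimizing a distribution of adversarial examples can be sketched in a toy setting. Everything below is illustrative, not the paper's actual setup: a tiny mean-pooling linear "model" stands in for a transformer, a matrix `theta` parameterizes one categorical distribution over the vocabulary per token position, and Gumbel-softmax samples give differentiable soft one-hot vectors so that gradient ascent on the model's loss can flow back into `theta`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a transformer: mean-pooled token embeddings fed to a
# linear head. The attack only needs differentiable access to the
# victim's embedding matrix and logits.
vocab_size, embed_dim, num_classes, seq_len = 50, 16, 2, 8
embedding = torch.nn.Embedding(vocab_size, embed_dim)
classifier = torch.nn.Linear(embed_dim, num_classes)
true_label = torch.tensor([0])

# theta parameterizes a *distribution* over token sequences: row i holds
# unnormalized log-probabilities over the vocabulary for position i.
theta = torch.zeros(seq_len, vocab_size, requires_grad=True)
optimizer = torch.optim.Adam([theta], lr=0.3)

def expected_loss(theta):
    """Cross-entropy of the model on the distribution's mean embedding."""
    pi = theta.softmax(dim=-1)                 # (seq_len, vocab_size)
    inputs = pi @ embedding.weight             # soft token embeddings
    logits = classifier(inputs.mean(dim=0, keepdim=True))
    return F.cross_entropy(logits, true_label)

init_loss = expected_loss(theta).item()

for _ in range(200):
    # Gumbel-softmax draws a differentiable soft one-hot per position,
    # so gradients of the (negated) loss flow back into theta.
    pi = F.gumbel_softmax(theta, tau=1.0, hard=False)
    inputs = pi @ embedding.weight
    logits = classifier(inputs.mean(dim=0, keepdim=True))
    loss = -F.cross_entropy(logits, true_label)  # maximize model error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

final_loss = expected_loss(theta).item()

# Draw a discrete adversarial candidate from the learned distribution;
# repeated sampling here is what enables the transfer attack.
adv_tokens = F.gumbel_softmax(theta, tau=1.0, hard=True).argmax(dim=-1)
print(init_loss, final_loss, adv_tokens.tolist())
```

Because the optimized object is a distribution rather than a single sequence, many discrete candidates can be drawn at the end and forwarded to a different model, which is what makes the hard-label black-box transfer attack possible.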

Related research

- Meta Gradient Adversarial Attack (08/09/2021). In recent years, research on adversarial attacks has become a hot spot. ...
- Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework (10/28/2021). Despite great success on many machine learning tasks, deep neural networks ...
- Character-level White-Box Adversarial Attacks against Transformers via Attachable Subwords Substitution (10/31/2022). We propose the first character-level white-box adversarial attack method ...
- White-Box Multi-Objective Adversarial Attack on Dialogue Generation (05/05/2023). Pre-trained transformers are popular in state-of-the-art dialogue generation ...
- Block-Sparse Adversarial Attack to Fool Transformer-Based Text Classifiers (03/11/2022). Recently, it has been shown that, in spite of the significant performance ...
- Step by Step Loss Goes Very Far: Multi-Step Quantization for Adversarial Text Attacks (02/10/2023). We propose a novel gradient-based attack against transformer-based language ...
- Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks (11/02/2021). Adversarial attacks based on randomized search schemes have obtained ...
