Generating Fluent Adversarial Examples for Natural Languages

07/13/2020
by Huangzhao Zhang, et al.

Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. Firstly, as the sentence space is discrete, it is difficult to make small perturbations along the direction of gradients. Secondly, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of gradients. Experiments on IMDB and SNLI show that our proposed MHA outperforms the baseline model in attacking capability. Adversarial training with MHA also leads to better robustness and performance.
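The core idea is a Metropolis-Hastings loop whose proposal distribution favors word-level edits suggested by the victim model's gradients, while the stationary distribution rewards sentences that are both fluent and misclassified. The sketch below illustrates the generic accept/reject step only; the functions `propose`, `victim_score`, and `lm_prob` are hypothetical placeholders standing in for the gradient-guided proposal, the victim model's error probability, and a language-model fluency score, and this is not the authors' implementation.

```python
import random

# Minimal sketch of a Metropolis-Hastings attack loop (assumed interfaces,
# not the paper's code):
#   propose(x)      -> (candidate, q_forward, q_backward)  # gradient-guided edit
#   victim_score(x) -> probability the victim model errs on x
#   lm_prob(x)      -> language-model probability of x (fluency)

def stationary_prob(sentence, victim_score, lm_prob):
    """Unnormalized target: fluent sentences that the victim misclassifies."""
    return lm_prob(sentence) * victim_score(sentence)

def metropolis_hastings_attack(sentence, propose, victim_score, lm_prob, steps=200):
    current = sentence
    p_current = stationary_prob(current, victim_score, lm_prob)
    for _ in range(steps):
        candidate, q_fwd, q_bwd = propose(current)
        p_candidate = stationary_prob(candidate, victim_score, lm_prob)
        # Acceptance ratio: alpha = min(1, p(x') q(x|x') / (p(x) q(x'|x)))
        alpha = min(1.0, (p_candidate * q_bwd) / max(p_current * q_fwd, 1e-12))
        if random.random() < alpha:
            current, p_current = candidate, p_candidate
    return current
```

Because the acceptance ratio keeps the chain's stationary distribution proportional to the fluency-times-attack-success target, the sampler can use biased, gradient-informed proposals without sacrificing the fluency guarantee that naive gradient perturbation lacks.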

