Attacking Text Classifiers via Sentence Rewriting Sampler

04/17/2021
by Lei Xu, et al.

Most adversarial attack methods on text classification change the classifier's prediction by modifying a few words or characters. Few attempt to attack classifiers by rewriting a whole sentence, because sentence-level rephrasing is difficult and the rewritten sentence must maintain high semantic similarity and sentence quality. To tackle this problem, we design a general sentence rewriting sampler (SRS) framework, which can conditionally generate meaningful sentences. We then customize SRS to attack text classification models. Our method can effectively rewrite the original sentence in multiple ways while maintaining high semantic similarity and good sentence quality. Experimental results show that many of these rewritten sentences are misclassified by the classifier. Our method achieves a better attack success rate on 4 out of 7 datasets, as well as significantly better sentence quality on all 7 datasets.

research
10/22/2020

Rewriting Meaningful Sentences via Conditional BERT Sampling and an application on fooling text classifiers

Most adversarial attack methods that are designed to deceive a text clas...
research
01/20/2020

Short Text Classification via Term Graph

Short text classification is a method for classifying short sentence with...
research
05/29/2017

Character-Based Text Classification using Top Down Semantic Model for Sentence Representation

Despite the success of deep learning on many fronts especially image and...
research
12/01/2018

Discrete Attacks and Submodular Optimization with Applications to Text Classification

Adversarial examples are carefully constructed modifications to an input...
research
08/23/2021

Semantic-Preserving Adversarial Text Attacks

Deep neural networks (DNNs) are known to be vulnerable to adversarial im...
research
09/26/2019

Rethinking Text Attribute Transfer: A Lexical Analysis

Text attribute transfer is modifying certain linguistic attributes (e.g....
research
03/25/2023

Backdoor Attacks with Input-unique Triggers in NLP

Backdoor attack aims at inducing neural models to make incorrect predict...
