Adversarial Examples with Difficult Common Words for Paraphrase Identification

09/05/2019
by Zhouxing Shi, et al.

Despite the success of deep models for paraphrase identification on benchmark datasets, these models remain vulnerable to adversarial examples. In this paper, we propose a novel algorithm that generates a new type of adversarial example for studying the robustness of deep paraphrase identification models. We first sample an original sentence pair from the corpus and then adversarially replace some word pairs with difficult common words. We take multiple replacement steps and use beam search to find a modification that makes the target model fail, thereby obtaining an adversarial example. The word replacement is constrained by heuristic rules and a language model so that the label and grammaticality of the example are preserved during modification. Experiments show that the performance of the target model drops dramatically on the adversarial examples our algorithm generates. Human annotators, by contrast, are much less affected, and the generated sentences remain largely grammatical. We also show that adversarial training with the generated adversarial examples can improve model robustness.
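The search procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the target model, the language-model check, the stopword rule, and the candidate word list below are all toy stand-ins, and every name (`beam_search_attack`, `toy_model_prob`, `toy_lm_ok`, `CANDIDATES`, `WEIGHTS`) is hypothetical. The sketch covers only a pair labeled as a paraphrase, and assumes that a word shared by both sentences is replaced by the same candidate word in both, so that the label plausibly stays unchanged.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Sentence = Tuple[str, ...]
Pair = Tuple[Sentence, Sentence]

# Toy heuristic rule (assumption): never replace these function words.
STOPWORDS = {"i", "can", "a", "the"}


def shared_positions(p: Sentence, q: Sentence) -> List[Tuple[int, int]]:
    """Index pairs (i, j) where both sentences hold the same non-stopword."""
    return [(i, j) for i, w in enumerate(p) for j, v in enumerate(q)
            if w == v and w not in STOPWORDS]


def beam_search_attack(pair: Pair,
                       candidates: Sequence[str],
                       model_prob: Callable[[Sentence, Sentence], float],
                       lm_ok: Callable[[Sentence], bool],
                       beam_size: int = 2,
                       max_steps: int = 3) -> Optional[Pair]:
    """Return a modified pair scored below 0.5 by the model, or None."""
    beam = [pair]
    for _ in range(max_steps):
        expanded = set()
        for p, q in beam:
            for i, j in shared_positions(p, q):
                if p[i] in candidates:      # position was replaced earlier
                    continue
                for w in candidates:
                    if w in p or w in q:    # use each candidate at most once
                        continue
                    new_p = p[:i] + (w,) + p[i + 1:]
                    new_q = q[:j] + (w,) + q[j + 1:]
                    if lm_ok(new_p) and lm_ok(new_q):
                        expanded.add((new_p, new_q))
        if not expanded:
            return None
        # Lowest probability of the true label ("paraphrase") comes first.
        ranked = sorted(expanded, key=lambda s: (model_prob(*s), s))
        if model_prob(*ranked[0]) < 0.5:
            return ranked[0]
        beam = ranked[:beam_size]
    return None


# Toy target model (assumption): weighted word overlap, where the listed
# "difficult" words contribute very little to the paraphrase score.
WEIGHTS = {"rarely": 0.1, "quite": 0.1}
CANDIDATES = ["rarely", "quite"]


def toy_model_prob(p: Sentence, q: Sentence) -> float:
    shared, union = set(p) & set(q), set(p) | set(q)
    return sum(WEIGHTS.get(w, 1.0) for w in shared) / len(union)


def toy_lm_ok(tokens: Sentence) -> bool:
    """Stand-in for a language-model check: reject repeated adjacent tokens."""
    return all(a != b for a, b in zip(tokens, tokens[1:]))


original: Pair = (("how", "can", "i", "learn", "python", "fast"),
                  ("how", "can", "i", "learn", "python", "quickly"))
adversarial = beam_search_attack(original, CANDIDATES, toy_model_prob, toy_lm_ok)
print(adversarial)
```

Under these toy settings a single replacement leaves the paraphrase score above 0.5, and a second replacement pushes it below, which is why the search runs for more than one step. A real setup would plug in a trained paraphrase identification model for `toy_model_prob` and a language-model score threshold for `toy_lm_ok`.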


