DeepAI AI Chat
Log In Sign Up

Preserving Semantics in Textual Adversarial Attacks

11/08/2022
by   David Herel, et al.
Czech Technical University in Prague
0

Adversarial attacks in NLP challenge the way we look at language models. The goal of this kind of adversarial attack is to modify the input text to fool a classifier while maintaining the original meaning of the text. Although most existing adversarial attacks claim to fulfill the constraint of semantics preservation, careful scrutiny shows otherwise. We show that the problem lies in the text encoders used to determine the similarity of adversarial examples, specifically in the way they are trained. Unsupervised training methods make these encoders more susceptible to problems with antonym recognition. To overcome this, we introduce a simple, fully supervised sentence embedding technique called Semantics-Preserving-Encoder (SPE). The results show that our solution minimizes the variation in the meaning of the adversarial examples generated. It also significantly improves the overall quality of adversarial examples, as confirmed by human evaluators. Furthermore, it can be used as a component in any existing attack to speed up its execution while maintaining similar attack success.

READ FULL TEXT

page 1

page 2

page 3

page 4

04/25/2020

Reevaluating Adversarial Examples in Natural Language

State-of-the-art attacks on NLP models have different definitions of wha...
09/09/2021

Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification

Research shows that natural language processing models are generally con...
11/04/2021

Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

Large-scale pre-trained language models have achieved tremendous success...
02/08/2020

Attacking Optical Character Recognition (OCR) Systems with Adversarial Watermarks

Optical character recognition (OCR) is widely applied in real applicatio...
06/01/2023

Constructing Semantics-Aware Adversarial Examples with Probabilistic Perspective

In this study, we introduce a novel, probabilistic viewpoint on adversar...
09/13/2021

Adversarial Examples for Evaluating Math Word Problem Solvers

Standard accuracy metrics have shown that Math Word Problem (MWP) solver...