Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

03/03/2018
by   Minhao Cheng, et al.

Crafting adversarial examples has become an important technique to evaluate the robustness of deep neural networks (DNNs). However, most existing works focus on attacking the image classification problem, since its input space is continuous and its output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and whose outputs have an almost infinite number of possibilities. To address the challenges caused by the discrete input space, we propose a projected gradient method combined with group lasso and gradient regularization. To handle the almost infinite output space, we design novel loss functions to conduct non-overlapping attacks and targeted keyword attacks. We apply our algorithm to machine translation and text summarization tasks, and verify the effectiveness of the proposed algorithm: by changing fewer than 3 words, we can make a seq2seq model produce desired outputs with high success rates. On the other hand, we recognize that, compared with well-evaluated CNN-based classifiers, seq2seq models are intrinsically more robust to adversarial attacks.
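
Below is a minimal PyTorch sketch of the projected gradient idea the abstract describes: optimize a perturbation of the input in embedding space, penalize it with a group lasso term so that only a few whole words change, and repeatedly project the perturbed embeddings back onto real vocabulary embeddings to keep the input discrete. The model interface, the attack_loss objective, and all hyperparameters here are illustrative assumptions, not the authors' implementation.

    import torch

    def seq2sick_attack_sketch(model, vocab_emb, input_ids, attack_loss,
                               steps=100, lr=0.5, group_weight=1.0):
        """Projected-gradient attack sketch in embedding space.

        vocab_emb:   (V, d) embedding matrix of a hypothetical seq2seq model.
        input_ids:   (L,) token ids of the clean input sentence.
        attack_loss: callable(model, embeddings) -> scalar loss to minimize
                     (e.g. a non-overlapping or targeted-keyword objective).
        """
        x0 = vocab_emb[input_ids].detach()          # (L, d) clean embeddings
        delta = torch.zeros_like(x0, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        adv_ids = input_ids.clone()

        for _ in range(steps):
            opt.zero_grad()
            loss = attack_loss(model, x0 + delta)
            # Group lasso over per-word rows of delta: drives entire rows
            # to zero, so as few words as possible are modified.
            loss = loss + group_weight * delta.norm(dim=1).sum()
            loss.backward()
            opt.step()

            # Projection: snap each perturbed embedding onto the nearest
            # real vocabulary embedding, keeping the input discrete.
            with torch.no_grad():
                adv_ids = torch.cdist(x0 + delta, vocab_emb).argmin(dim=1)
                delta.copy_(vocab_emb[adv_ids] - x0)

        return adv_ids  # adversarial token ids

The group lasso term sums the L2 norms of the per-word perturbation rows, which encourages whole rows to vanish and thereby limits how many words are edited, in line with the paper's reported attacks that change fewer than 3 words.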


