On Adversarial Examples for Character-Level Neural Machine Translation

06/23/2018
by Javid Ebrahimi, et al.

Evaluating on adversarial examples has become a standard procedure for measuring the robustness of deep learning models. Because creating white-box adversarial examples for discrete text input is difficult, most analyses of the robustness of NLP models have relied on black-box adversarial examples. We investigate adversarial examples for character-level neural machine translation (NMT) and contrast black-box adversaries with a novel white-box adversary, which employs differentiable string-edit operations to rank adversarial changes. We propose two novel types of attacks that aim to remove or change a word in a translation, rather than simply breaking the NMT system. We demonstrate that white-box adversarial examples are significantly stronger than their black-box counterparts in different attack scenarios, revealing more serious vulnerabilities than previously known. In addition, after adversarial training, which takes only about three times longer than regular training, the model's robustness improves significantly.
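The white-box adversary's key idea is to use the gradient of the loss with respect to a one-hot character encoding to estimate, to first order, how much each candidate character edit would increase the loss, and then to rank edits by that estimate. The sketch below illustrates this ranking in PyTorch; it is a minimal illustration under assumptions, not the paper's actual code, and the `model`, `loss_fn`, and input interface are all hypothetical names introduced here.

```python
import torch

def rank_char_flips(model, loss_fn, one_hot_src, target):
    """Rank every (position, new_char) substitution by its estimated loss increase.

    one_hot_src: (seq_len, vocab_size) one-hot character encoding of the source.
    Returns flips sorted from most to least damaging under a first-order estimate.
    """
    x = one_hot_src.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), target)
    loss.backward()
    grad = x.grad  # shape: (seq_len, vocab_size)

    # First-order estimate of the loss change from flipping the character at
    # position i from its current symbol a to symbol b: grad[i, b] - grad[i, a].
    grad_at_current = (grad * x).sum(dim=1, keepdim=True)  # grad[i, a] per row
    gain = grad - grad_at_current  # (seq_len, vocab_size) estimated increases

    scores, flat = gain.flatten().sort(descending=True)
    positions = torch.div(flat, grad.size(1), rounding_mode="floor")
    new_chars = flat % grad.size(1)
    return list(zip(positions.tolist(), new_chars.tolist(), scores.tolist()))
```

The top-ranked flip can then be applied and the process repeated greedily. For the targeted attacks described above (removing or changing a specific word in the translation), one would presumably choose a loss that penalizes or encourages the target word in the output rather than maximizing the overall translation loss.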
