HotFlip: White-Box Adversarial Examples for NLP

12/19/2017
by Javid Ebrahimi, et al.

Adversarial examples expose vulnerabilities of machine learning models. We propose an efficient method to generate white-box adversarial examples that trick character-level and word-level neural models. Our method, HotFlip, relies on an atomic flip operation, which swaps one token for another, based on the gradients of the one-hot input vectors. In experiments on text classification and machine translation, we find that only a few manipulations are needed to greatly increase the error rates. We analyze the properties of these examples, and show that employing these adversarial examples in training can improve test-time accuracy on clean examples, as well as defend the models against adversarial examples.
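
To make the flip operation concrete, the sketch below shows the first-order estimate that drives HotFlip: the gain from replacing the token at position i (currently token a) with token b is approximated by the difference of loss gradients with respect to the one-hot input, grad[i, b] - grad[i, a], and the attack picks the position/token pair with the largest gain. The gradient matrix and alphabet here are hypothetical stand-ins; in practice the gradient comes from backpropagating the model's loss to its one-hot inputs.

```python
import numpy as np

def best_hotflip(one_hot: np.ndarray, grad: np.ndarray):
    """Pick the single token flip with the largest first-order loss increase.

    one_hot: (seq_len, vocab_size) one-hot encoding of the input text.
    grad:    (seq_len, vocab_size) gradient of the loss w.r.t. the one-hot input.
    Returns (position, old_token_index, new_token_index, estimated_loss_gain).
    """
    seq_len, vocab_size = one_hot.shape
    current = one_hot.argmax(axis=1)                # current token index at each position
    # Estimated loss change for flipping position i to token b: grad[i, b] - grad[i, current[i]]
    gain = grad - grad[np.arange(seq_len), current][:, None]
    gain[np.arange(seq_len), current] = -np.inf     # disallow "flipping" to the same token
    pos, new = np.unravel_index(gain.argmax(), gain.shape)
    return int(pos), int(current[pos]), int(new), float(gain[pos, new])

# Toy example: a 3-character input over a hypothetical 4-character alphabet {a, b, c, d}.
one_hot = np.eye(4)[[0, 2, 1]]                      # encodes the string "a c b"
grad = np.random.randn(3, 4)                        # stand-in for a real loss gradient
print(best_hotflip(one_hot, grad))
```

A beam search over sequences of such flips, rather than a single greedy flip, gives the multi-edit attacks evaluated in the paper.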


