Learning-based Hybrid Local Search for the Hard-label Textual Attack

01/20/2022
by Zhen Yu, et al.

Deep neural networks are vulnerable to adversarial examples in Natural Language Processing. However, existing textual adversarial attacks usually rely on the gradient or prediction confidence to generate adversarial examples, making them hard to deploy in real-world applications. To this end, we consider a rarely investigated but more rigorous setting, namely the hard-label attack, in which the attacker can only access the predicted label. In particular, we find that the changes in the predicted label caused by word substitutions in the adversarial example precisely reflect the importance of different words. Based on this observation, we propose a novel hard-label attack, called the Learning-based Hybrid Local Search (LHLS) algorithm, which effectively estimates word importance from the predicted labels in the attack history and integrates this information into a hybrid local search to optimize the adversarial perturbation. Extensive evaluations on text classification and textual entailment across various datasets and models show that LHLS significantly outperforms existing hard-label attacks in both attack performance and adversarial example quality.
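The key mechanism described above lends itself to a short illustration. Below is a minimal Python sketch of estimating word importance from the prediction labels already collected during an attack; the function name, the `history` format (candidate word sequences paired with their predicted labels), and the flip-rate scoring heuristic are illustrative assumptions, not the authors' exact LHLS procedure.

```python
from collections import defaultdict

def estimate_word_importance(orig_words, history, true_label):
    """Score word positions from the attack history alone (no new queries):
    a position is important if candidates that perturb it tend to flip
    the prediction away from the true label.

    history: list of (candidate_words, predicted_label) pairs recorded
    while querying the hard-label model; an illustrative format."""
    flips = defaultdict(int)   # perturbed-and-flipped counts per position
    trials = defaultdict(int)  # perturbed counts per position
    for cand_words, pred_label in history:
        flipped = pred_label != true_label
        for i, (orig, cand) in enumerate(zip(orig_words, cand_words)):
            if orig == cand:
                continue  # this candidate left position i unchanged
            trials[i] += 1
            if flipped:
                flips[i] += 1
    # Importance = empirical flip rate among candidates perturbing position i.
    return {i: flips[i] / trials[i] for i in trials}

# Hypothetical usage with history from attacking a binary sentiment model.
orig = "the movie was great".split()
history = [("the movie was awful".split(), 0),  # substitution flipped the label
           ("a movie was great".split(), 1)]    # substitution did not
print(estimate_word_importance(orig, history, true_label=1))
# {3: 1.0, 0: 0.0} -> perturbing "great" matters; perturbing "the" does not
```

Scores of this kind could then steer a local search toward substituting or restoring high-importance positions first, reducing the hard-label queries spent on unimportant words.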


Related research

08/01/2023
LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack
Natural language processing models are vulnerable to adversarial example...

12/24/2020
A Context Aware Approach for Generating Natural Language Attacks
We study an important task of attacking natural language processing mode...

12/29/2020
Generating Natural Language Attacks in a Hard Label Black Box Setting
We study an important and challenging task of attacking natural language...

09/13/2021
Randomized Substitution and Vote for Textual Adversarial Example Detection
A line of work has shown that natural text processing models are vulnera...

09/06/2021
Efficient Combinatorial Optimization for Word-level Adversarial Textual Attack
Over the past few years, various word-level textual attack approaches ha...

05/22/2022
Phrase-level Textual Adversarial Attack with Label Preservation
Generating high-quality textual adversarial examples is critical for inv...

02/17/2020
Query-Efficient Physical Hard-Label Attacks on Deep Learning Visual Classification
We present Survival-OPT, a physical adversarial example algorithm in the...
