Target Model Agnostic Adversarial Attacks with Query Budgets on Language Understanding Models

06/13/2021
by   Jatin Chauhan, et al.
9

Despite significant improvements in natural language understanding models with the advent of models like BERT and XLNet, these neural-network based classifiers are vulnerable to blackbox adversarial attacks, where the attacker is only allowed to query the target model outputs. We add two more realistic restrictions on the attack methods, namely limiting the number of queries allowed (query budget) and crafting attacks that easily transfer across different pre-trained models (transferability), which render previous attack models impractical and ineffective. Here, we propose a target model agnostic adversarial attack method with a high degree of attack transferability across the attacked models. Our empirical studies show that in comparison to baseline methods, our method generates highly transferable adversarial sentences under the restriction of limited query budgets.

READ FULL TEXT

page 8

page 14

research
11/07/2019

Active Learning for Black-Box Adversarial Attacks in EEG-Based Brain-Computer Interfaces

Deep learning has made significant breakthroughs in many fields, includi...
research
03/18/2021

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

Natural language processing (NLP) tasks, ranging from text classificatio...
research
04/20/2020

Headless Horseman: Adversarial Attacks on Transfer Learning Models

Transfer learning facilitates the training of task-specific classifiers ...
research
01/27/2021

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

Written language contains stylistic cues that can be exploited to automa...
research
04/29/2022

Logically Consistent Adversarial Attacks for Soft Theorem Provers

Recent efforts within the AI community have yielded impressive results t...
research
12/08/2021

SNEAK: Synonymous Sentences-Aware Adversarial Attack on Natural Language Video Localization

Natural language video localization (NLVL) is an important task in the v...
research
09/17/2020

Generating Label Cohesive and Well-Formed Adversarial Claims

Adversarial attacks reveal important vulnerabilities and flaws of traine...

Please sign up or login with your details

Forgot password? Click here to reset