Towards Robustness Against Natural Language Word Substitutions

07/28/2021
by Xinshuai Dong, et al.

Robustness against word substitutions has a well-defined and widely accepted form, i.e., using semantically similar words as substitutions, and is thus considered a fundamental stepping-stone towards broader robustness in natural language processing. Previous defense methods capture word substitutions in vector space using either an l_2-ball or a hyper-rectangle, which yields perturbation sets that are either not inclusive enough or unnecessarily large, and thus impedes mimicry of the worst cases for robust training. In this paper, we introduce a novel Adversarial Sparse Convex Combination (ASCC) method. We model the word-substitution attack space as a convex hull and leverage a regularization term to enforce the perturbation towards an actual substitution, thus aligning our modeling better with the discrete textual space. Based on the ASCC method, we further propose ASCC-defense, which leverages ASCC to generate worst-case perturbations and incorporates adversarial training towards robustness. Experiments show that ASCC-defense outperforms the current state of the art in terms of robustness on two prevailing NLP tasks, i.e., sentiment analysis and natural language inference, against several attacks and across multiple model architectures. Moreover, we envision a new class of defense towards robustness in NLP, in which our robustly trained word vectors can be plugged into a normally trained model to confer robustness without applying any other defense technique.
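To make the idea concrete, below is a minimal PyTorch sketch of the kind of inner maximization the abstract describes: each token's perturbed embedding is a convex combination (softmax-weighted mixture) of the embeddings of its substitution candidates, optimized to increase the task loss while an entropy penalty drives the weights towards a single vertex, i.e., an actual substitution word. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name `ascc_perturb`, the argument layout, and the assumption that `model` consumes embedded inputs directly are all hypothetical.

```python
import torch
import torch.nn.functional as F

def ascc_perturb(embeddings, sub_ids, model, labels, steps=10, lr=1.0, alpha=1.0):
    """ASCC-style inner maximization (illustrative sketch, not the authors' code).

    embeddings: the victim model's nn.Embedding layer
    sub_ids:    LongTensor [batch, seq_len, num_subs] of substitution-word ids
                (a token's own id padded in when it has fewer candidates)
    model:      a classifier that consumes embedded inputs directly
    """
    # Vertices of the convex hull: embeddings of each token's substitution set.
    sub_vecs = embeddings(sub_ids).detach()            # [B, T, K, D]

    # Free parameters: one logit per candidate; softmax keeps the weights convex.
    w_logits = torch.zeros(sub_ids.shape, device=sub_ids.device, requires_grad=True)
    opt = torch.optim.Adam([w_logits], lr=lr)

    for _ in range(steps):
        w = F.softmax(w_logits, dim=-1)                # convex weights, sum to 1
        x_adv = (w.unsqueeze(-1) * sub_vecs).sum(2)    # convex combination [B, T, D]
        task_loss = F.cross_entropy(model(x_adv), labels)
        # Entropy penalty: low entropy concentrates weight on one vertex,
        # pushing the perturbation towards an actual substitution word.
        entropy = -(w * torch.log(w + 1e-12)).sum(-1).mean()
        opt.zero_grad()
        (-task_loss + alpha * entropy).backward()      # ascend loss, descend entropy
        opt.step()

    w = F.softmax(w_logits.detach(), dim=-1)
    return (w.unsqueeze(-1) * sub_vecs).sum(2)         # worst-case embedded input
```

In an ASCC-defense-style training loop, embeddings returned this way would replace the clean inputs when computing the training loss; the step count, optimizer, and entropy weight `alpha` here are placeholder choices, not values from the paper.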



Related research

10/15/2021
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models
Backdoor attacks, which maliciously control a well-trained model's outpu...

06/20/2020
Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble
Despite neural networks have achieved prominent performance on many natu...

05/08/2021
Certified Robustness to Text Adversarial Attacks by Randomized [MASK]
Recently, few certified defense methods have been developed to provably ...

11/30/2018
Adversarial Defense by Stratified Convolutional Sparse Coding
We propose an adversarial defense method that achieves state-of-the-art ...

04/08/2022
Backdoor Attack against NLP models with Robustness-Aware Perturbation defense
Backdoor attack intends to embed hidden backdoor into deep neural networ...

02/15/2021
Certified Robustness to Programmable Transformations in LSTMs
Deep neural networks for natural language processing are fragile in the ...

05/04/2022
Rethinking Classifier and Adversarial Attack
Various defense models have been proposed to resist adversarial attack a...
