Towards Robustness Against Natural Language Word Substitutions

07/28/2021
by Xinshuai Dong, et al.

Robustness against word substitutions has a well-defined and widely accepted form, i.e., using semantically similar words as substitutions, and is thus considered a fundamental stepping stone towards broader robustness in natural language processing. Previous defense methods capture word substitutions in vector space with either an l_2-ball or a hyper-rectangle, which yields perturbation sets that are either not inclusive enough or unnecessarily large, and thus impedes mimicking worst cases for robust training. In this paper, we introduce a novel Adversarial Sparse Convex Combination (ASCC) method. We model the word-substitution attack space as a convex hull and leverage a regularization term to push perturbations toward actual substitutions, thus aligning our modeling better with the discrete textual space. Based on the ASCC method, we further propose ASCC-defense, which leverages ASCC to generate worst-case perturbations and incorporates adversarial training to achieve robustness. Experiments show that ASCC-defense outperforms the current state of the art in terms of robustness on two prevailing NLP tasks, i.e., sentiment analysis and natural language inference, under several attacks and across multiple model architectures. In addition, we envision a new class of defenses towards robustness in NLP, where our robustly trained word vectors can be plugged into a normally trained model and improve its robustness without any other defense technique being applied.
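To make the convex-hull modeling concrete, here is a minimal PyTorch sketch of the ASCC idea: the perturbed word vector is a convex combination of the substitution candidates' embeddings, and an entropy regularizer pushes the combination weights toward a single actual substitution. The function name `ascc_perturb`, the tensor shapes, and the hyperparameters `alpha`, `steps`, and `lr` are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of ASCC-style worst-case perturbation search.
# Assumed shapes: sub_embeddings is (seq_len, num_subs, dim), holding the
# embeddings of each position's substitution candidates (the original word
# included); model_loss_fn maps a (seq_len, dim) embedded input to a scalar loss.
import torch
import torch.nn.functional as F

def ascc_perturb(sub_embeddings, model_loss_fn, alpha=1.0, steps=10, lr=1.0):
    # Treat the candidate embeddings as fixed while searching for weights.
    sub_embeddings = sub_embeddings.detach()

    # Unconstrained logits; softmax keeps the combination inside the convex hull.
    w = torch.zeros(sub_embeddings.shape[:2], requires_grad=True)
    optimizer = torch.optim.Adam([w], lr=lr)

    for _ in range(steps):
        probs = F.softmax(w, dim=-1)                       # convex weights per position
        x_hat = torch.einsum('sn,snd->sd', probs, sub_embeddings)
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
        # Maximize the task loss while minimizing entropy (sparsity),
        # so we descend on the negated objective.
        objective = -(model_loss_fn(x_hat) - alpha * entropy)
        optimizer.zero_grad()
        objective.backward()
        optimizer.step()

    with torch.no_grad():
        probs = F.softmax(w, dim=-1)
        return torch.einsum('sn,snd->sd', probs, sub_embeddings)
```

As `alpha` grows, the weights approach one-hot vectors, so the perturbed input approximates an actual discrete substitution; in a defense loop in the spirit of ASCC-defense, the returned embedding would then be fed back into the model for adversarial training.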
