Enhancing Model Robustness By Incorporating Adversarial Knowledge Into Semantic Representation

02/23/2021
by   Jinfeng Li, et al.

Although deep neural networks (DNNs) have achieved enormous success in many domains such as natural language processing (NLP), they have also been shown to be vulnerable to maliciously generated adversarial examples. This inherent vulnerability threatens a wide range of real-world, DNN-based applications. To strengthen model robustness, several countermeasures have been proposed in the English NLP domain and have obtained satisfactory performance. However, due to the unique properties of the Chinese language, it is not trivial to extend existing defenses to the Chinese domain. Therefore, we propose AdvGraph, a novel defense that enhances the robustness of Chinese NLP models by incorporating adversarial knowledge into the semantic representation of the input. Extensive experiments on two real-world tasks show that AdvGraph outperforms previous work in three respects: (i) effective - it significantly strengthens model robustness even under the adaptive attack setting, without degrading performance on legitimate input; (ii) generic - its key component, the representation of connotative adversarial knowledge, is task-agnostic and can be reused in any Chinese NLP model without retraining; and (iii) efficient - it is a lightweight defense with sub-linear computational complexity, which guarantees the efficiency required in practical scenarios.
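The core idea of incorporating adversarial knowledge into the input representation can be illustrated with a minimal sketch. Here, a toy "adversarial graph" links Chinese characters to their confusable variants, and a simple graph-derived embedding (rows of the row-normalized adjacency matrix) is concatenated with a semantic embedding. All names, the example character pairs, and the embedding scheme are illustrative assumptions, not the paper's actual construction or data.

```python
import numpy as np

# Hypothetical adversarial graph: edges link a character to variants an
# attacker might substitute for it (visually or phonetically confusable).
# The character pairs below are illustrative placeholders only.
chars = ["银", "很", "行", "形"]
edges = [("银", "很"), ("行", "形")]

idx = {c: i for i, c in enumerate(chars)}
adj = np.eye(len(chars))  # self-loops keep each row nonzero
for a, b in edges:
    adj[idx[a], idx[b]] = adj[idx[b], idx[a]] = 1.0

# Toy graph embedding: row-normalized adjacency, so a character and its
# confusable variants end up with overlapping vectors.
graph_emb = adj / adj.sum(axis=1, keepdims=True)

# Stand-in for a pretrained semantic embedding (e.g. word2vec or BERT).
rng = np.random.default_rng(0)
semantic_emb = rng.normal(size=(len(chars), 8))

def represent(char: str) -> np.ndarray:
    """Fuse semantic and adversarial-graph embeddings for one character."""
    i = idx[char]
    return np.concatenate([semantic_emb[i], graph_emb[i]])
```

Because confusable characters share graph neighborhoods, an adversarially substituted variant lands near the original in the fused representation, which is what lets a downstream task model absorb the adversarial knowledge without retraining the graph component.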


