Defense of Word-level Adversarial Attacks via Random Substitution Encoding

05/01/2020
by Zhaoyang Wang, et al.

Adversarial attacks against deep neural networks on computer vision tasks have spawned many new technologies that help protect models from making false predictions. Recently, word-level adversarial attacks on deep models for Natural Language Processing (NLP) tasks have also demonstrated strong power, e.g., fooling a sentiment classification neural network into making a wrong decision. Unfortunately, little previous work has discussed the defense against such word-level, synonym-substitution-based attacks, since they are hard to perceive and detect. In this paper, we shed light on this problem and propose a novel defense framework called Random Substitution Encoding (RSE), which introduces a random substitution encoder into the training process of the original neural network. Extensive experiments on text classification tasks demonstrate the effectiveness of our framework in defending against word-level adversarial attacks under various base models and attack models.
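
The abstract only sketches the mechanism, but the core idea of substituting words with random synonyms before encoding during training can be illustrated with a minimal example. The synonym table, substitution probability, and the way the substituted tokens are fed to the classifier below are illustrative assumptions, not the paper's exact formulation.

```python
import random

# Minimal sketch (assumed, not the paper's exact RSE algorithm):
# during training, each word may be randomly replaced by one of its
# synonyms before being encoded, so the downstream text classifier
# learns to be stable under synonym substitutions.

SYNONYMS = {
    "good": ["great", "fine", "nice"],
    "bad": ["poor", "awful", "terrible"],
    "movie": ["film", "picture"],
}

def random_substitute(tokens, p=0.3, synonyms=SYNONYMS, rng=random):
    """Return a copy of `tokens` where each word that has known synonyms
    is replaced by a randomly chosen synonym with probability `p`."""
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out

# Usage: apply the random substitution to each training example every
# epoch, then encode the substituted tokens and train the classifier.
example = ["this", "movie", "is", "good"]
print(random_substitute(example))  # e.g. ['this', 'film', 'is', 'great']
```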


