Improving the Adversarial Robustness of NLP Models by Information Bottleneck

06/11/2022
by   Cenyuan Zhang, et al.
0

Existing studies have demonstrated that adversarial examples can be directly attributed to the presence of non-robust features, which are highly predictive, but can be easily manipulated by adversaries to fool NLP models. In this study, we explore the feasibility of capturing task-specific robust features, while eliminating the non-robust ones by using the information bottleneck theory. Through extensive experiments, we show that the models trained with our information bottleneck-based method are able to achieve a significant improvement in robust accuracy, exceeding performances of all the previously reported defense methods while suffering almost no performance drop in clean accuracy on SST-2, AGNEWS and IMDB datasets.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/06/2019

Adversarial Examples Are Not Bugs, They Are Features

Adversarial examples have attracted significant attention in machine lea...
research
10/25/2022

Causal Information Bottleneck Boosts Adversarial Robustness of Deep Neural Network

The information bottleneck (IB) method is a feasible defense solution ag...
research
07/12/2021

A Closer Look at the Adversarial Robustness of Information Bottleneck Models

We study the adversarial robustness of information bottleneck models for...
research
02/13/2020

The Conditional Entropy Bottleneck

Much of the field of Machine Learning exhibits a prominent set of failur...
research
06/04/2021

Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness

We investigate the HSIC (Hilbert-Schmidt independence criterion) bottlen...
research
04/24/2020

RAIN: Robust and Accurate Classification Networks with Randomization and Enhancement

Along with the extensive applications of CNN models for classification, ...
research
10/15/2019

Extracting robust and accurate features via a robust information bottleneck

We propose a novel strategy for extracting features in supervised learni...

Please sign up or login with your details

Forgot password? Click here to reset