InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

10/05/2020
by Boxin Wang, et al.

Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy over several adversarial datasets on Natural Language Inference (NLI) and Question Answering (QA) tasks.
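Both regularizers hinge on estimating mutual information between features, which in practice is typically done with a variational bound such as InfoNCE. Below is a minimal sketch, assuming PyTorch, of how such a bound can be computed over a batch; the names (BilinearCritic, infonce_lower_bound, alpha) are illustrative assumptions, not taken from the InfoBERT paper or codebase.

```python
# Minimal sketch of an InfoNCE-style mutual information lower bound, the kind
# of variational estimator that mutual-information regularizers like InfoBERT's
# are typically built on. All names here are illustrative, not InfoBERT's.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilinearCritic(nn.Module):
    """Scores (local, global) feature pairs with a learned bilinear form."""
    def __init__(self, dim_local: int, dim_global: int):
        super().__init__()
        self.W = nn.Parameter(0.02 * torch.randn(dim_local, dim_global))

    def forward(self, local_feats, global_feats):
        # local_feats: (B, dim_local); global_feats: (B, dim_global).
        # Entry (i, j) of the result scores the pair (local_i, global_j).
        return local_feats @ self.W @ global_feats.t()

def infonce_lower_bound(scores):
    """InfoNCE bound on I(X; Y): log(B) minus cross-entropy vs. the diagonal.

    Diagonal entries are the true (paired) samples; the rest of each row
    serves as in-batch negatives.
    """
    labels = torch.arange(scores.size(0), device=scores.device)
    return math.log(scores.size(0)) - F.cross_entropy(scores, labels)

if __name__ == "__main__":
    critic = BilinearCritic(dim_local=768, dim_global=768)
    local_feats = torch.randn(32, 768)    # e.g., token-level features
    global_feats = torch.randn(32, 768)   # e.g., a sentence-level representation
    mi_est = infonce_lower_bound(critic(local_feats, global_feats))
    # A regularizer that *increases* MI, like the Robust Feature regularizer,
    # would subtract this bound (scaled by a hypothetical weight alpha) from
    # the task loss; an Information Bottleneck term would instead penalize an
    # MI estimate between the input and the feature representation.
    print(f"estimated MI lower bound: {mi_est.item():.3f}")
```

Under these assumptions, the two regularizers pull in opposite directions on different MI terms: the bottleneck term shrinks input-representation MI to discard noisy, attack-sensitive signal, while the feature term grows local-global MI to anchor the representation to robust features.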


Related research

12/22/2021 · How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?
The fine-tuning of pre-trained language models has a great success in ma...

10/26/2022 · Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness
Adversarial vulnerability remains a major obstacle to constructing relia...

03/21/2022 · An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels
Pre-trained language models derive substantial linguistic and factual kn...

05/31/2019 · Information Minimization In Emergent Languages
There is a growing interest in studying the languages emerging when neur...

04/28/2022 · Improving robustness of language models from a geometry-aware perspective
Recent studies have found that removing the norm-bounded projection and ...

09/21/2020 · Improving Robustness and Generality of NLP Models Using Disentangled Representations
Supervised neural networks, which first map an input x to a single repre...

12/04/2020 · Unsupervised Adversarially-Robust Representation Learning on Graphs
Recent works have demonstrated that deep learning on graphs is vulnerabl...
