FreeLB: Enhanced Adversarial Training for Language Understanding

09/25/2019
by Chen Zhu, et al.

Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved effective for improving the generalization of language models. In this work, we propose a novel adversarial training algorithm, FreeLB, which promotes higher robustness and invariance in the embedding space by adding adversarial perturbations to word embeddings and minimizing the resultant adversarial risk inside different regions around input samples. To validate the effectiveness of the proposed approach, we apply it to Transformer-based models for natural language understanding and commonsense reasoning tasks. Experiments on the GLUE benchmark show that when applied only to the finetuning stage, it is able to improve the overall test score of the BERT-base model from 78.3 to 79.4, and that of the RoBERTa-large model from 88.5 to 88.8. In addition, the proposed approach achieves state-of-the-art test accuracies of 85.39% and 67.32% on ARC-Easy and ARC-Challenge. Experiments on the CommonsenseQA benchmark further demonstrate that FreeLB generalizes and boosts the performance of the RoBERTa-large model on other tasks as well.
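
The following is a minimal sketch of a FreeLB-style training step, not the authors' reference implementation. It assumes a Hugging Face-style Transformer classifier that accepts `inputs_embeds` and returns a `.loss` when `labels` are provided; the names `adv_steps`, `adv_lr`, and `adv_eps` are illustrative hyperparameters. The key idea is that parameter gradients accumulate across several ascent steps on an embedding-space perturbation, so a single optimizer update averages the adversarial risk over the points visited inside the perturbation ball.

```python
# FreeLB-style step (sketch): accumulate gradients over K ascent steps on an
# embedding perturbation, then update the model parameters once.
import torch


def freelb_step(model, optimizer, batch, adv_steps=3, adv_lr=1e-1, adv_eps=1e-1):
    model.train()
    optimizer.zero_grad()
    word_emb = model.get_input_embeddings()

    # Perturbation in embedding space, one vector per token position.
    # (The paper initializes it uniformly inside a small ball; zeros keep the
    # sketch short. Masking out padding positions is also omitted here.)
    delta = torch.zeros_like(word_emb(batch["input_ids"])).requires_grad_()

    for _ in range(adv_steps):
        # Re-embed the inputs each step so gradients flow into the embedding
        # matrix on every backward pass without retaining old graphs.
        embeds = word_emb(batch["input_ids"])
        loss = model(inputs_embeds=embeds + delta,
                     attention_mask=batch["attention_mask"],
                     labels=batch["labels"]).loss / adv_steps

        # "Free" trick: parameter gradients from every ascent step accumulate,
        # so the final update averages the loss along the ascent trajectory.
        loss.backward()

        # Gradient ascent on delta, then project back into the eps-ball
        # (per-example L2 norm over all token positions).
        grad = delta.grad.detach()
        grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1)
        delta = delta.detach() + adv_lr * grad / grad_norm
        delta_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1)
        delta = (delta * (adv_eps / delta_norm).clamp(max=1.0)).requires_grad_()

    optimizer.step()
    return loss.item() * adv_steps  # loss at the final ascent point


# Usage (illustrative): called once per batch inside the usual finetuning loop;
# standard ingredients such as gradient clipping and LR warmup still apply.
# loss_value = freelb_step(model, optimizer, batch)
```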
