A Data Cartography based MixUp for Pre-trained Language Models

05/06/2022
by   Seo Yeon Park, et al.

MixUp is a data augmentation strategy in which additional samples are generated during training by combining random pairs of training samples and their labels. However, selecting pairs at random may not be optimal. In this work, we propose TDMixUp, a novel MixUp strategy that leverages Training Dynamics and allows more informative samples to be combined when generating new data samples. Our proposed TDMixUp first measures confidence and variability (Swayamdipta et al., 2020) and Area Under the Margin (AUM) (Pleiss et al., 2020) to identify the characteristics of training samples (e.g., as easy-to-learn or ambiguous samples), and then interpolates these characterized samples. We empirically validate that our method not only achieves competitive performance using a smaller subset of the training data compared with strong baselines, but also yields lower expected calibration error on the pre-trained language model BERT, in both in-domain and out-of-domain settings, across a wide range of NLP tasks. We publicly release our code.
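The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code. It assumes per-epoch logits have been recorded for every training sample, and it uses the standard definitions: confidence and variability are the mean and standard deviation, across epochs, of the probability assigned to the gold label, and AUM is the mean margin between the gold-label logit and the largest other logit. The function names `training_dynamics` and `mixup`, the `alpha` default, and the idea of mixing one "easy" and one "ambiguous" sample are illustrative assumptions; the abstract does not state the exact selection or pairing rule.

```python
import numpy as np


def training_dynamics(logits, labels):
    """Compute confidence, variability, and AUM per training sample.

    logits: float array of shape (epochs, n_samples, n_classes),
            recorded once per epoch (an assumption of this sketch).
    labels: int array of shape (n_samples,) with gold class indices.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(labels.shape[0])

    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)

    # Probability of the gold label at each epoch: shape (epochs, n_samples).
    gold_prob = probs[:, idx, labels]
    confidence = gold_prob.mean(axis=0)   # mean over epochs
    variability = gold_prob.std(axis=0)   # std over epochs

    # Margin = gold logit minus the largest non-gold logit, per epoch.
    gold_logit = logits[:, idx, labels]
    others = logits.copy()
    others[:, idx, labels] = -np.inf      # exclude the gold class
    margin = gold_logit - others.max(axis=-1)
    aum = margin.mean(axis=0)             # average margin over epochs

    return confidence, variability, aum


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=None):
    """Standard MixUp interpolation of two samples and their one-hot labels.

    alpha=0.4 is an arbitrary illustrative value, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix
```

In such a sketch, samples with high confidence and low variability would be treated as easy-to-learn and samples with high variability as ambiguous; a pair drawn from the two groups (for example, their hidden representations `x_a` and `x_b`) could then be passed to `mixup`. Any thresholds used to form those groups would be a design choice of the implementer.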


