Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

10/22/2020
by   Lingkai Kong, et al.
2

Fine-tuned pre-trained language models can suffer from severe miscalibration for both in-distribution and out-of-distribution (OOD) data due to over-parameterization. To mitigate this issue, we propose a regularized fine-tuning method. Our method introduces two types of regularization for better calibration: (1) On-manifold regularization, which generates pseudo on-manifold samples through interpolation within the data manifold. Augmented training with these pseudo samples imposes a smoothness regularization to improve in-distribution calibration. (2) Off-manifold regularization, which encourages the model to output uniform distributions for pseudo off-manifold samples to address the over-confidence issue for OOD data. Our experiments demonstrate that the proposed method outperforms existing calibration methods for text classification in terms of expectation calibration error, misclassification detection, and OOD detection on six datasets. Our code can be found at https://github.com/Lingkai-Kong/Calibrated-BERT-Fine-Tuning.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/03/2022

Embedding Hallucination for Few-Shot Language Fine-tuning

Few-shot language learners adapt knowledge from a pre-trained model to r...
research
05/22/2023

Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection

Out-of-distribution (OOD) detection is a critical task for reliable pred...
research
06/12/2022

Fine-tuning Pre-trained Language Models with Noise Stability Regularization

The advent of large-scale pre-trained language models has contributed gr...
research
12/17/2020

MASKER: Masked Keyword Regularization for Reliable Text Classification

Pre-trained language models have achieved state-of-the-art accuracies on...
research
10/17/2022

Pseudo-OOD training for robust language models

While pre-trained large-scale deep models have garnered attention as an ...
research
08/27/2021

Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors

In this paper, we explore the capacity of a language model-based method ...
research
07/21/2023

Making Pre-trained Language Models both Task-solvers and Self-calibrators

Pre-trained language models (PLMs) serve as backbones for various real-w...

Please sign up or login with your details

Forgot password? Click here to reset