MASKER: Masked Keyword Regularization for Reliable Text Classification

12/17/2020
by Seung Jun Moon, et al.

Pre-trained language models have achieved state-of-the-art accuracies on various text classification tasks, e.g., sentiment analysis, natural language inference, and semantic textual similarity. However, the reliability of the fine-tuned text classifiers is an often overlooked performance criterion. For instance, one may desire a model that can detect out-of-distribution (OOD) samples (drawn far from the training distribution) or is robust against domain shifts. We claim that one central obstacle to reliability is the model's over-reliance on a limited number of keywords, instead of on the whole context. In particular, we find that (a) OOD samples often contain in-distribution keywords, while (b) cross-domain samples may not always contain keywords; over-relying on keywords can be problematic in both cases. In light of this observation, we propose a simple yet effective fine-tuning method, coined masked keyword regularization (MASKER), that facilitates context-based prediction. MASKER regularizes the model to reconstruct the keywords from the rest of the words and to make low-confidence predictions without enough context. When applied to various pre-trained language models (e.g., BERT, RoBERTa, and ALBERT), we demonstrate that MASKER improves OOD detection and cross-domain generalization without degrading classification accuracy. Code is available at https://github.com/alinlab/MASKER.
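As a rough illustration of the objective the abstract describes, the following is a minimal PyTorch sketch of how such a fine-tuning loss could be assembled. It is not the authors' implementation (see the linked repository for that): the keyword_mask input, the [CLS]-pooled classifier head, the separate mlm_head, and the loss weights lam_mkr and lam_ent are all assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def masker_loss(encoder, cls_head, mlm_head,
                input_ids, attention_mask, labels,
                keyword_mask, mask_token_id,
                lam_mkr=0.001, lam_ent=0.001):
    """encoder: HF-style model whose output exposes .last_hidden_state.
    keyword_mask: bool tensor (B, T), True at pre-selected keyword
    positions (assumed False at special and padding tokens)."""
    # 1) Standard classification loss on the unmodified input.
    h = encoder(input_ids, attention_mask=attention_mask).last_hidden_state
    loss_cls = F.cross_entropy(cls_head(h[:, 0]), labels)  # [CLS] pooling

    # 2) Masked keyword reconstruction: hide the keywords and train the
    #    model to recover them from the surrounding context.
    ids_mkr = input_ids.masked_fill(keyword_mask, mask_token_id)
    h_mkr = encoder(ids_mkr, attention_mask=attention_mask).last_hidden_state
    mlm_logits = mlm_head(h_mkr)                           # (B, T, vocab)
    targets = input_ids.masked_fill(~keyword_mask, -100)   # score keywords only
    loss_mkr = F.cross_entropy(mlm_logits.reshape(-1, mlm_logits.size(-1)),
                               targets.reshape(-1), ignore_index=-100)

    # 3) Entropy regularization on keyword-only input: with the context
    #    removed, push the classifier toward a uniform (low-confidence)
    #    prediction by minimizing negative entropy.
    keep = keyword_mask.clone()
    keep[:, 0] = True                   # keep [CLS]; a simplification here
    ids_ent = input_ids.masked_fill(~keep, mask_token_id)
    h_ent = encoder(ids_ent, attention_mask=attention_mask).last_hidden_state
    probs = F.softmax(cls_head(h_ent[:, 0]), dim=-1)
    loss_ent = (probs * probs.clamp_min(1e-12).log()).sum(-1).mean()

    return loss_cls + lam_mkr * loss_mkr + lam_ent * loss_ent
```

In this sketch the keywords are assumed to be chosen before fine-tuning (e.g., by frequency or attention scores; the abstract leaves the exact procedure to the paper), and the two weights trade raw classification accuracy against the reliability objectives.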

