Improving Pre-trained Language Models' Generalization

07/19/2023
by Somayeh Ghanbarzadeh, et al.

The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their poor generalization: their performance drops sharply when they are evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD) or unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which hold for frequent example types but fail on more general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates the Masked Language Modeling (MLM) training objective into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques, improving PLMs' generalization on OOD datasets while also boosting their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.
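The abstract does not spell out implementation details, but one plausible reading of "integrating the MLM objective into fine-tuning" is a joint loss computed on a shared encoder. The sketch below illustrates that idea with HuggingFace Transformers; the shared-encoder wiring, the 15% masking rate, and the loss weight `lam` are illustrative assumptions for this sketch, not the paper's actual method or settings.

```python
# Minimal sketch of one plausible Mask-tuning setup: jointly optimizing an
# MLM loss and the downstream fine-tuning loss on a shared encoder.
# The shared-encoder wiring, masking rate, and `lam` are assumptions.
import torch
from torch.optim import AdamW
from transformers import (
    AutoTokenizer,
    BertForMaskedLM,
    BertForSequenceClassification,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
clf = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased")
mlm.bert = clf.bert  # share the encoder so both objectives update the same weights


def mask_tokens(input_ids, tokenizer, mlm_prob=0.15):
    """Randomly replace a fraction of non-special tokens with [MASK]."""
    labels = input_ids.clone()
    prob = torch.full(labels.shape, mlm_prob)
    special = torch.tensor(
        [tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
         for ids in labels.tolist()],
        dtype=torch.bool,
    )
    prob.masked_fill_(special, 0.0)  # never mask [CLS]/[SEP]/padding
    masked = torch.bernoulli(prob).bool()
    labels[~masked] = -100  # compute MLM loss only on masked positions
    corrupted = input_ids.clone()
    corrupted[masked] = tokenizer.mask_token_id
    return corrupted, labels


# One joint training step on a toy sentiment batch.
texts = ["a genuinely great movie", "a dull, forgettable movie"]
targets = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = AdamW(list(clf.parameters()) + list(mlm.cls.parameters()), lr=2e-5)
masked_ids, mlm_labels = mask_tokens(batch["input_ids"], tokenizer)

clf_out = clf(**batch, labels=targets)          # standard fine-tuning loss
mlm_out = mlm(input_ids=masked_ids,             # MLM loss on the masked copy
              attention_mask=batch["attention_mask"],
              labels=mlm_labels)

lam = 0.5  # assumed weighting between the two objectives
loss = clf_out.loss + lam * mlm_out.loss
loss.backward()
optimizer.step()
```

Summing the two losses keeps the encoder anchored to its pre-training objective while it adapts to the downstream task, which is one intuitive way such a scheme could discourage reliance on spurious, task-specific correlations.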


