Addressing Distribution Shift at Test Time in Pre-trained Language Models

12/05/2022
by Ayush Singh, et al.

State-of-the-art pre-trained language models (PLMs) outperform other models on the majority of language processing tasks. However, PLMs degrade in performance under distribution shift, which occurs when test-time data does not come from the same distribution as the source training set. An equally difficult challenge is obtaining labels in real time, due to issues such as long labeling-feedback loops. The lack of adequate methods for these challenges motivates approaches that continuously adapt the PLM to a distinct distribution. Unsupervised domain adaptation adapts a source model to an unseen and unlabeled target domain. While techniques such as data augmentation can adapt models in several scenarios, they have been only sparsely studied for the distribution shift problem. In this work, we present MEMO-CL, an approach that improves the performance of PLMs at test time under distribution shift. Our approach takes advantage of the latest unsupervised techniques in data augmentation and adaptation to minimize the entropy of the PLM's output distribution, operating on a batch of augmented samples drawn from a single observation in the test set. The technique is unsupervised, domain-agnostic, easy to implement, and requires no additional data. Our experiments show a 3% improvement over current test-time adaptation baselines.
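The core mechanism described in the abstract, minimizing the entropy of the model's output distribution over a batch of augmentations of one unlabeled test input, can be sketched as follows. This is a minimal, hypothetical PyTorch sketch of a generic marginal-entropy-minimization step, not the paper's actual implementation; `model`, `optimizer`, and the `augment` callable are placeholders supplied by the caller.

```python
import torch
import torch.nn.functional as F

def memo_adapt_step(model, optimizer, x, augment, n_aug=8):
    """One entropy-minimization step on a single unlabeled test input.

    Builds a batch of `n_aug` augmented copies of `x`, averages the
    model's predicted class probabilities over that batch (the marginal
    output distribution), and takes one gradient step that minimizes
    the entropy of the marginal. Returns the entropy before the update.
    """
    model.train()
    # Batch of augmented views of the single observation x.
    batch = torch.stack([augment(x) for _ in range(n_aug)])
    probs = F.softmax(model(batch), dim=-1)   # (n_aug, n_classes)
    marginal = probs.mean(dim=0)              # marginal over augmentations
    # Shannon entropy of the marginal distribution (clamped for stability).
    entropy = -(marginal * marginal.clamp_min(1e-12).log()).sum()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return entropy.item()
```

For a PLM, the augmentations would be text-level transformations (for example, token dropout or paraphrasing) applied to the raw input before encoding; the sketch above is agnostic to how `augment` is defined.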


