Robust Question Answering against Distribution Shifts with Test-Time Adaptation: An Empirical Study

02/09/2023
by Hai Ye, et al.

A deployed question answering (QA) model can easily fail when the test data has a distribution shift compared to the training data. Robustness tuning (RT) methods have been widely studied to enhance model robustness against distribution shifts before model deployment. However, can we improve a model after deployment? To answer this question, we evaluate test-time adaptation (TTA) to improve a model after deployment. We first introduce COLDQA, a unified evaluation benchmark for robust QA against text corruption and changes in language and domain. We then evaluate previous TTA methods on COLDQA and compare them to RT methods. We also propose a novel TTA method called online imitation learning (OIL). Through extensive experiments, we find that TTA is comparable to RT methods, and applying TTA after RT can significantly boost the performance on COLDQA. Our proposed OIL improves TTA to be more robust to variation in hyper-parameters and test distributions over time.

research
09/03/2023

Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering

Robustness in Natural Language Processing continues to be a pertinent is...
research
12/20/2022

To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering

Recent advances in open-domain question answering (ODQA) have demonstrat...
research
11/02/2022

Continual Conscious Active Fine-Tuning to Robustify Online Machine Learning Models Against Data Distribution Shifts

Unlike their offline traditional counterpart, online machine learning mo...
research
02/07/2017

Semi-Supervised QA with Generative Domain-Adaptive Nets

We study the problem of semi-supervised question answering: utilizing ...
research
06/26/2020

Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

Out-of-training-distribution (OOD) scenarios are a common challenge of l...
research
04/29/2020

The Effect of Natural Distribution Shift on Question Answering Models

We build four new test sets for the Stanford Question Answering Dataset ...
research
11/21/2022

MATE: Masked Autoencoders are Online 3D Test-Time Learners

We propose MATE, the first Test-Time-Training (TTT) method designed for ...
