Improving QA Generalization by Concurrent Modeling of Multiple Biases

10/07/2020
by Mingzhu Wu, et al.

Existing NLP datasets contain various biases that models can easily exploit to achieve high performance on the corresponding evaluation sets. However, a model that focuses on dataset-specific biases is limited in its ability to learn more generalizable knowledge about the task from more general data patterns. In this paper, we investigate the impact of debiasing methods on generalization and propose a general framework that improves performance on both in-domain and out-of-domain datasets by concurrently modeling multiple biases in the training data. Our framework weights each example based on the biases it contains and the strength of those biases in the training data. It then uses these weights in the training objective so that the model relies less on examples with high bias weights. We extensively evaluate our framework on extractive question answering with training data from various domains containing multiple biases of different strengths. We perform the evaluations in two settings, in which the model is trained on a single domain or on multiple domains simultaneously, and show its effectiveness in both settings compared to state-of-the-art debiasing methods.
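
To make the weighting idea concrete, below is a minimal PyTorch sketch of bias-based example reweighting. It assumes that for each known bias a bias-only predictor has already scored every training example (the probability it assigns to the gold label); the combination rule and the function names multi_bias_weights and debiased_loss are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def multi_bias_weights(bias_probs):
    """Combine per-bias confidences into a single weight per example.

    bias_probs: list of (batch,) tensors, one per modeled bias; each entry
    is the probability a bias-only predictor assigns to the gold label, so
    values near 1 mean the example is strongly explained by that bias.
    """
    # Hypothetical combination rule: an example's weight shrinks with every
    # bias that predicts it confidently, so heavily biased examples
    # contribute little to the gradient.
    weights = torch.ones_like(bias_probs[0])
    for p in bias_probs:
        weights = weights * (1.0 - p)
    return weights

def debiased_loss(logits, labels, bias_probs):
    """Cross-entropy in which strongly biased examples are down-weighted."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    return (multi_bias_weights(bias_probs) * per_example).mean()
```

For extractive QA, logits and labels would correspond to answer-span start and end positions, with one such loss term per position; the key point is only that the per-example weight, derived from the biases present and their strength, scales the training objective.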
