Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions

05/23/2023
by Zhihan Zhang, et al.

Contrast consistency, the ability of a model to make consistently correct predictions in the presence of perturbations, is an essential aspect of NLP. While it has been studied in tasks such as sentiment analysis and reading comprehension, it remains unexplored in open-domain question answering (OpenQA) because of the difficulty of collecting perturbed questions that satisfy factuality requirements. In this work, we collect minimally edited questions as challenging contrast sets to evaluate OpenQA models. Our collection approach combines human annotation with large language model generation. We find that the widely used dense passage retriever (DPR) performs poorly on our contrast sets, despite fitting the training set well and performing competitively on standard test sets. To address this issue, we introduce a simple and effective query-side contrastive loss, aided by data augmentation, to improve DPR training. Our experiments on the contrast sets demonstrate that DPR's contrast consistency improves without sacrificing its accuracy on the standard test sets.
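As a rough illustration of the definition of contrast consistency given in the abstract, the sketch below scores a model by the fraction of (original, minimally edited) question pairs for which it answers both questions correctly. This is not the paper's evaluation code: the function names, the exact-match comparison, the pair format, and the toy questions are all assumptions made for the example.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical pair format (an assumption, not from the paper):
# (original question, original answer, edited question, edited answer)
ContrastPair = Tuple[str, str, str, str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace for a simple exact-match comparison."""
    return " ".join(text.lower().split())


def contrast_consistency(
    predict: Callable[[str], str], pairs: Iterable[ContrastPair]
) -> float:
    """Fraction of pairs where BOTH the original and the edited question
    are answered correctly. Returns 0.0 for an empty collection."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    consistent = 0
    for question, answer, edited_question, edited_answer in pairs:
        original_ok = normalize(predict(question)) == normalize(answer)
        edited_ok = normalize(predict(edited_question)) == normalize(edited_answer)
        if original_ok and edited_ok:
            consistent += 1
    return consistent / len(pairs)


# Toy data, made up for illustration only.
toy_pairs = [
    ("What is the capital of France?", "Paris",
     "What is the capital of Spain?", "Madrid"),
    ("Who wrote Hamlet?", "William Shakespeare",
     "Who wrote Faust?", "Johann Wolfgang von Goethe"),
]

# A toy "model" that ignores the edit in the first pair.
toy_answers = {
    "What is the capital of France?": "Paris",
    "What is the capital of Spain?": "Paris",  # wrong: the edit was ignored
    "Who wrote Hamlet?": "William Shakespeare",
    "Who wrote Faust?": "Johann Wolfgang von Goethe",
}


def toy_predict(question: str) -> str:
    return toy_answers.get(question, "")


score = contrast_consistency(toy_predict, toy_pairs)  # 0.5: one of two pairs is consistent
```

In this toy case the model answers three of the four individual questions correctly, yet only half of the pairs are consistent, which is the kind of gap a pair-level measure is meant to expose.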

research
12/19/2017

The NarrativeQA Reading Comprehension Challenge

Reading comprehension (RC)---in contrast to information retrieval---requ...
research
03/17/2021

Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA

Recent works have shown that supervised models often exploit data artifa...
research
04/06/2020

Evaluating NLP Models via Contrast Sets

Standard test sets for supervised learning evaluate in-distribution gene...
research
10/14/2021

Retrieval-guided Counterfactual Generation for QA

Deep NLP models have been shown to learn spurious correlations, leaving ...
research
08/06/2020

Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets

Ideally Open-Domain Question Answering models should exhibit a number of...
research
04/21/2020

Logic-Guided Data Augmentation and Regularization for Consistent Question Answering

Many natural language questions require qualitative, quantitative or log...
