Challenges in Generalization in Open Domain Question Answering

09/02/2021
by   Linqing Liu, et al.

Recent work on Open Domain Question Answering has shown that there is a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is as yet unclear which aspects of novel questions make them challenging. Drawing upon studies on systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models, we find that for the established Natural Questions and TriviaQA datasets, even the strongest model performance for comp-gen/novel-entity is 13.1/5.4 lower than on the full test set, indicating the challenge posed by these types of questions. Furthermore, we show that whilst non-parametric models can handle questions containing novel entities, they struggle with those requiring compositional generalization. Through thorough analysis, we find that the key factors in question difficulty are cascading errors from the retrieval component, the frequency of the question pattern, and the frequency of the entity.

research
08/06/2020

Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets

Ideally Open-Domain Question Answering models should exhibit a number of...
research
11/05/2021

Grounded Graph Decoding Improves Compositional Generalization in Question Answering

Question answering models struggle to generalize to novel compositions o...
research
11/16/2020

Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases

Existing studies on question answering on knowledge bases (KBQA) mainly ...
research
01/19/2023

Reversing The Twenty Questions Game

Twenty questions is a widely popular verbal game. In recent years, many ...
research
04/18/2021

Case-based Reasoning for Natural Language Queries over Knowledge Bases

It is often challenging for a system to solve a new complex problem from...
research
09/10/2021

Entity-Based Knowledge Conflicts in Question Answering

Knowledge-dependent tasks typically use two sources of knowledge: parame...
research
09/17/2021

Simple Entity-Centric Questions Challenge Dense Retrievers

Open-domain question answering has exploded in popularity recently due t...
