Relation-Guided Pre-Training for Open-Domain Question Answering

09/21/2021
by Ziniu Hu, et al.

Answering complex open-domain questions requires understanding the latent relations between the entities involved. However, we found that existing QA datasets are extremely imbalanced across relation types, which hurts generalization on questions with long-tail relations. To remedy this problem, we propose a Relation-Guided Pre-Training (RGPT-QA) framework. We first generate a relational QA dataset covering a wide range of relations from both Wikidata triplets and Wikipedia hyperlinks. We then pre-train a QA model to infer the latent relations from the question and to conduct extractive QA to get the target answer entity. We demonstrate that by pre-training with the proposed RGPT-QA technique, the popular open-domain QA model Dense Passage Retriever (DPR) achieves a 2.2 improvement in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions. In particular, we show that RGPT-QA improves significantly on questions with long-tail relations.
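
For readers unfamiliar with the Exact Match metric mentioned in the abstract, the following is a minimal sketch of how it is commonly computed. It assumes the normalization convention used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace); the abstract does not say which evaluation script the authors used, and the function names `normalize_answer`, `exact_match`, and `exact_match_accuracy` are hypothetical.

```python
import re
import string
from typing import List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the common SQuAD-style normalization, not
    necessarily the exact script used in the paper.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_accuracy(predictions: List[str], references: List[List[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a gold answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this convention, a prediction such as "The Eiffel Tower." counts as matching the gold answer "Eiffel Tower", while any prediction that differs after normalization scores zero for that question.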

research
03/22/2021

Open Domain Question Answering over Tables via Dense Retrieval

Recent advances in open-domain QA have led to strong models based on den...
research
01/01/2021

Reader-Guided Passage Reranking for Open-Domain Question Answering

Current open-domain question answering (QA) systems often follow a Retri...
research
03/16/2022

C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References

We consider the problem of pretraining a two-stage open-domain question ...
research
10/19/2021

DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment

The challenge of climate change and biome conservation is one of the mos...
research
09/27/2020

Unsupervised Pre-training for Biomedical Question Answering

We explore the suitability of unsupervised representation learning metho...
research
05/11/2023

Long-Tailed Question Answering in an Open World

Real-world data often have an open long-tailed distribution, and buildin...
research
01/05/2023

SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph

Existing multimodal conversation agents have shown impressive abilities ...
