Generation-Augmented Retrieval for Open-domain Question Answering

09/17/2020
by Yuning Mao, et al.

Conventional sparse retrieval methods such as TF-IDF and BM25 are simple and efficient, but they rely solely on lexical overlap and perform no semantic matching. Recent dense retrieval methods learn latent representations to tackle the lexical mismatch problem, but they are more computationally expensive and fall short on exact matching, because they embed the whole text sequence into a single vector with limited capacity. In this paper, we present Generation-Augmented Retrieval (GAR), a query expansion method that augments a query with relevant contexts through text generation. We demonstrate on open-domain question answering that the generated contexts significantly enrich the semantics of the queries, so that GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense methods such as DPR. We show that generating various contexts of a query is beneficial, as fusing their results consistently yields better retrieval accuracy. Moreover, as sparse and dense representations are often complementary, GAR can be easily combined with DPR to achieve even better performance. Furthermore, GAR achieves state-of-the-art performance on the Natural Questions and TriviaQA datasets under the extractive setting when equipped with an extractive reader, and consistently outperforms other retrieval methods when the same generative reader is used.
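The idea in the abstract (expand the query with generated contexts, retrieve with BM25 for each expanded query, then fuse the result lists) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy corpus, the small `BM25` class, the `generate_contexts` stub (a hard-coded stand-in for a trained text generator), the three example context strings, and the round-robin `fuse` function, which is one simple fusion strategy among many.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-word characters.
    return re.findall(r"\w+", text.lower())


class BM25:
    """Tiny in-memory BM25 scorer (illustrative, not optimized)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.tf = [Counter(tokenize(d)) for d in docs]
        self.dl = [sum(c.values()) for c in self.tf]
        self.avgdl = sum(self.dl) / len(docs)
        df = Counter(t for c in self.tf for t in c)  # document frequency
        n = len(docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        norm = self.k1 * (1 - self.b + self.b * self.dl[i] / self.avgdl)
        total = 0.0
        for t in tokenize(query):
            f = self.tf[i].get(t, 0)
            if f:
                total += self.idf[t] * f * (self.k1 + 1) / (f + norm)
        return total

    def rank(self, query, k):
        # Return indices of the top-k documents with a non-zero score.
        scored = [(self.score(query, i), i) for i in range(len(self.tf))]
        hits = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
        return [i for _, i in hits[:k]]


def generate_contexts(question):
    # Hypothetical stub: a real system would call a trained generator here.
    return [
        "William Shakespeare",
        "Hamlet is a tragedy written by the English playwright William Shakespeare",
        "Hamlet",
    ]


def fuse(rankings, k):
    # Round-robin interleaving of several ranked lists, skipping duplicates.
    fused, seen = [], set()
    for pos in range(max((len(r) for r in rankings), default=0)):
        for r in rankings:
            if pos < len(r) and r[pos] not in seen:
                seen.add(r[pos])
                fused.append(r[pos])
    return fused[:k]


docs = [
    "William Shakespeare was an English playwright and poet born in Stratford",
    "Hamlet is a tragedy set in Denmark about a prince seeking revenge",
    "The Danish pastry is a sweet baked good popular in Denmark",
]
question = "Who wrote Hamlet?"

bm25 = BM25(docs)
baseline = bm25.rank(question, k=2)

# One expanded query per generated context, each retrieved separately.
expanded = [f"{question} {context}" for context in generate_contexts(question)]
fused = fuse([bm25.rank(q, k=2) for q in expanded], k=2)

print("BM25 on the question alone:", baseline)
print("BM25 on expanded queries, fused:", fused)
```

In this toy setup the original question shares no terms with the document about Shakespeare, so plain BM25 cannot retrieve it; the expanded queries introduce the missing terms, which is the lexical-mismatch effect the abstract describes.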
