Generate rather than Retrieve: Large Language Models are Strong Context Generators

09/21/2022
by Wenhao Yu, et al.

Knowledge-intensive tasks, such as open-domain question answering (QA), require access to a large amount of world or domain knowledge. A common approach to these tasks is a retrieve-then-read pipeline, which first retrieves a handful of relevant contextual documents from an external corpus such as Wikipedia and then predicts an answer conditioned on the retrieved documents. In this paper, we present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators. We call our method generate-then-read (GenRead): it first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer. Furthermore, we propose a novel clustering-based prompting method that selects distinct prompts so that the generated documents cover different perspectives, leading to better recall over acceptable answers. We conduct extensive experiments on three different knowledge-intensive tasks: open-domain QA, fact checking, and dialogue systems. Notably, GenRead achieves 71.6 and 54.4 exact match scores on TriviaQA and WebQ, significantly outperforming the state-of-the-art retrieve-then-read pipeline DPR-FiD by +4.0 and +3.9, without retrieving any documents from any external knowledge source. Lastly, we demonstrate that model performance can be further improved by combining retrieval and generation.
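The generate-then-read flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_prompts`, `generate_then_read`), the prompt template, and the idea of passing pre-computed clusters of (question, document) demonstrations are all hypothetical, and the language model and reader are plain callables that the caller must supply.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A demonstration is a (question, document) pair used as an in-context example.
Demo = Tuple[str, str]


def build_prompts(
    question: str,
    clusters: Dict[int, Sequence[Demo]],
    demos_per_prompt: int = 2,
) -> List[str]:
    """Build one prompt per cluster so that the prompts differ from each other.

    Assumption: `clusters` was computed beforehand (for example by clustering
    embeddings of question-document pairs); the clustering itself is not shown.
    """
    prompts = []
    for cluster_id in sorted(clusters):
        demos = clusters[cluster_id][:demos_per_prompt]
        shots = "".join(f"Question: {q}\nDocument: {d}\n\n" for q, d in demos)
        # Hypothetical prompt template; the paper's exact wording is not given here.
        prompts.append(f"{shots}Question: {question}\nDocument:")
    return prompts


def generate_then_read(
    question: str,
    clusters: Dict[int, Sequence[Demo]],
    generate: Callable[[str], str],
    read: Callable[[str, List[str]], str],
) -> str:
    """Generate one contextual document per prompt, then read them all."""
    prompts = build_prompts(question, clusters)
    documents = [generate(prompt) for prompt in prompts]
    return read(question, documents)


if __name__ == "__main__":
    # Toy stand-ins for a large language model and a reader model.
    toy_clusters = {
        0: [("Who wrote Hamlet?", "Hamlet is a tragedy by William Shakespeare.")],
        1: [("What is the capital of France?", "Paris is the capital of France.")],
    }
    toy_generate = lambda prompt: f"(generated document for a {len(prompt)}-character prompt)"
    toy_read = lambda q, docs: f"answer based on {len(docs)} generated documents"
    print(generate_then_read("Who painted the Mona Lisa?", toy_clusters, toy_generate, toy_read))
```

Passing the generator and reader as callables keeps the sketch independent of any particular model API; drawing each prompt's demonstrations from a different cluster is one simple way to make the prompts, and therefore the generated documents, differ.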

research
07/21/2023

Generator-Retriever-Generator: A Novel Approach to Open-domain Question Answering

Open-domain question answering (QA) tasks usually require the retrieval ...
research
12/16/2022

Self-Prompting Large Language Models for Open-Domain QA

Open-Domain Question Answering (ODQA) requires models to answer factoid ...
research
05/23/2023

Query Rewriting for Retrieval-Augmented Large Language Models

Large Language Models (LLMs) play a powerful Reader of the Retrieve-then...
research
05/29/2023

GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking

Retrieval-enhanced text generation, which aims to leverage passages retr...
research
12/28/2022

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

Retrieval-augmented in-context learning has emerged as a powerful approa...
research
04/15/2021

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering

In open-domain question answering (QA), retrieve-and-read mechanism has ...
research
02/21/2021

Pruning the Index Contents for Memory Efficient Open-Domain QA

This work presents a novel pipeline that demonstrates what is achievable...
