Hindsight: Posterior-guided training of retrievers for improved open-ended generation

10/14/2021
by Ashwin Paranjape, et al.

Many text generation systems benefit from a retriever that fetches passages from a textual knowledge corpus (e.g., Wikipedia) and provides them as additional context to the generator. For open-ended generation tasks, such as generating informative utterances in conversations, many varied passages may be equally relevant, and we find that existing methods that jointly train the retriever and generator underperform: the retriever may not find relevant passages even amongst the top-10, and hence the generator may not learn a preference to ground its generated output in them. We propose using an additional guide retriever that is allowed to use the target output and "in hindsight" retrieve relevant passages during training. We model the guide retriever after the posterior distribution Q of passages given the input and the target output, and train it jointly with the standard retriever and the generator by maximizing the evidence lower bound (ELBo) in expectation over Q. For informative conversations from the Wizard of Wikipedia dataset, with posterior-guided training, the retriever finds passages with higher relevance in the top-10 (23% relative improvement), the generator's responses are more grounded in the retrieved passage (19% relative improvement), and the end-to-end system produces better overall output (6.4% relative improvement).
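To make the training objective in the abstract concrete, the following is a minimal sketch of the standard ELBo for a latent passage z: the expected generator log-likelihood under Q, minus the KL divergence from Q to the retriever's distribution. It assumes a small, fixed candidate set of passages and made-up scores. The function names (`softmax`, `elbo`, `log_marginal`) and all numbers are hypothetical and not taken from the paper, which may approximate the expectation differently (for example, over top-k passages sampled from Q).

```python
import math

def softmax(scores):
    """Normalise unnormalised scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def elbo(prior_scores, posterior_scores, gen_log_likelihoods):
    """ELBo over a fixed candidate set of passages (illustrative only).

    prior_scores:        retriever scores for P(z | x), using the input only
    posterior_scores:    guide-retriever scores for Q(z | x, y), using the target too
    gen_log_likelihoods: log P(y | x, z) from the generator, one per passage
    """
    p = softmax(prior_scores)
    q = softmax(posterior_scores)
    # Expected generator log-likelihood under the guide distribution Q.
    expected_ll = sum(qi * ll for qi, ll in zip(q, gen_log_likelihoods))
    # KL(Q || P): pulls the standard retriever towards the guide retriever.
    kl = sum(qi * (math.log(qi) - math.log(pi)) for qi, pi in zip(q, p) if qi > 0)
    return expected_ll - kl

def log_marginal(prior_scores, gen_log_likelihoods):
    """log sum_z P(z | x) P(y | x, z), the quantity the ELBo lower-bounds."""
    p = softmax(prior_scores)
    terms = [math.log(pi) + ll for pi, ll in zip(p, gen_log_likelihoods)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Hypothetical scores for three candidate passages.
prior = [0.0, 1.0, -1.0]
posterior = [2.0, 0.0, 1.0]
gen_ll = [-2.0, -5.0, -0.5]

bound = elbo(prior, posterior, gen_ll)
exact = log_marginal(prior, gen_ll)
```

In this sketch the bound never exceeds the log marginal likelihood, and it becomes tight when the guide scores match the true posterior, which is proportional to P(z | x) P(y | x, z).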
