Generating (Formulaic) Text by Splicing Together Nearest Neighbors

01/20/2021
by Sam Wiseman, et al.

We propose to tackle conditional text generation tasks, especially those which require generating formulaic text, by splicing together segments of text from retrieved "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors in an encoder-decoder setting but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text (i.e., by inserting or replacing them) to form an output. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way allow for interpretable table-to-text or headline generation that is competitive with neighbor-based token-level policies on automatic metrics, though on all but one dataset neighbor-based policies underperform a strong neighborless baseline. In all cases, however, generating by splicing is faster.
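To make the splicing idea above concrete, here is a small illustrative sketch, not the paper's algorithm: a dynamic program that recovers a shortest "derivation" of a target sentence as a sequence of operations that either copy a contiguous segment from a retrieved neighbor or insert a single new token. The paper instead recovers shortest oracle derivations by parsing under a weighted context-free grammar over a richer operation set (including segment replacement); the function name, the two-operation inventory, and the unit costs below are assumptions for illustration only.

```python
from typing import List, Tuple

Op = Tuple[str, object]  # ("copy", tuple_of_tokens) or ("insert", token)

def shortest_splice_derivation(target: List[str], neighbor: List[str]) -> List[Op]:
    """Toy oracle: build `target` from the fewest operations, where each
    operation either copies a contiguous segment of `neighbor` or inserts a
    single new token. A simplified stand-in for the paper's WCFG parsing."""
    n = len(target)

    def in_neighbor(span: List[str]) -> bool:
        # True if `span` occurs as a contiguous sub-sequence of `neighbor`.
        k = len(span)
        return any(neighbor[s:s + k] == span for s in range(len(neighbor) - k + 1))

    # best[i] holds (operation count, operations) for generating target[i:].
    best: List[Tuple[int, List[Op]]] = [(0, [])] * (n + 1)
    for i in range(n - 1, -1, -1):
        # Option 1: insert target[i] as a brand-new token.
        cost, ops = best[i + 1]
        cand = (cost + 1, [("insert", target[i])] + ops)
        # Option 2: splice in a contiguous neighbor segment matching target[i:j].
        for j in range(i + 1, n + 1):
            span = target[i:j]
            if in_neighbor(span):
                cost_j, ops_j = best[j]
                if cost_j + 1 < cand[0]:
                    cand = (cost_j + 1, [("copy", tuple(span))] + ops_j)
        best[i] = cand
    return best[0][1]


if __name__ == "__main__":
    neighbor = "arsenal beat chelsea 2 - 0 at home on saturday".split()
    target = "arsenal beat spurs 3 - 1 at home on sunday".split()
    # Prints one shortest sequence of copy/insert operations.
    for op in shortest_splice_derivation(target, neighbor):
        print(op)
```

Because every operation has unit cost in this sketch, the program prefers splicing long neighbor segments over inserting tokens one by one, which mirrors the intuition in the abstract; the paper's derivations additionally allow replacing segments and are recovered exactly by weighted CFG parsing.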


Related research

02/05/2019 · Non-Monotonic Sequential Text Generation: Standard sequential generation methods assume a pre-specified generation...

08/26/2022 · Nearest Neighbor Non-autoregressive Text Generation: Non-autoregressive (NAR) models can generate sentences with less computa...

03/01/2022 · Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots: Few-shot table-to-text generation is a task of composing fluent and fait...

10/22/2022 · Information-Transport-based Policy for Simultaneous Translation: Simultaneous translation (ST) outputs translation while receiving the so...

08/12/2019 · Neural Text Generation with Unlikelihood Training: Neural text generation is a key tool in natural language applications, b...

04/18/2021 · A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation: Large pretrained generative models like GPT-3 often suffer from hallucin...

10/22/2021 · Simple Dialogue System with AUDITED: We devise a multimodal conversation system for dialogue utterances compo...
