RankGen: Improving Text Generation with Large Ranking Models

05/19/2022
by Kalpesh Krishna, et al.

Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts. To address these issues, we present RankGen, an encoder model (1.2B parameters) that scores model generations given a prefix. RankGen can be flexibly incorporated as a scoring function in beam search and used to decode from any pretrained language model. We train RankGen using large-scale contrastive learning to map a prefix close to the ground-truth sequence that follows it and far away from two types of negatives: (1) random sequences from the same document as the prefix, which discourage topically-similar but irrelevant generations; and (2) sequences generated from a large language model conditioned on the prefix, which discourage repetition and hallucination. Experiments across four different language models (345M-11B parameters) and two domains show that RankGen significantly outperforms decoding algorithms like nucleus, top-k, and typical sampling on both automatic metrics (85.0 vs 77.3 MAUVE) and human evaluations with English writers (74.5% preference over nucleus sampling). Analysis reveals that RankGen outputs are more relevant to the prefix and improve continuity and coherence compared to baselines. We open source our model checkpoints, code, and human preferences with detailed explanations for future research.
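To make the abstract's setup concrete, here is a minimal sketch (not the authors' released implementation) of the two ideas it describes: scoring a candidate continuation by the similarity between a prefix embedding and a continuation embedding, and using that score to rerank candidates during decoding, with an InfoNCE-style contrastive loss over in-document and model-generated negatives. The functions encode_prefix and encode_suffix are placeholder stand-ins for the two encoder towers (they return hash-seeded random vectors so the sketch runs end-to-end); in practice one would load the authors' open-sourced RankGen checkpoints instead.

```python
# Sketch of RankGen-style scoring and reranking; encoders below are placeholders.
import numpy as np

DIM = 16  # placeholder embedding size; the real encoder is far larger


def _placeholder_vector(tag: str, text: str) -> np.ndarray:
    """Deterministic stand-in for an encoder: seed an RNG from the text."""
    rng = np.random.default_rng(abs(hash((tag, text))) % (2**32))
    v = rng.normal(size=DIM)
    return v / np.linalg.norm(v)


def encode_prefix(text: str) -> np.ndarray:
    """Placeholder for the prefix encoder tower."""
    return _placeholder_vector("prefix", text)


def encode_suffix(text: str) -> np.ndarray:
    """Placeholder for the continuation (suffix) encoder tower."""
    return _placeholder_vector("suffix", text)


def rankgen_score(prefix: str, continuation: str) -> float:
    """Score = dot product of prefix and continuation embeddings (higher is better)."""
    return float(encode_prefix(prefix) @ encode_suffix(continuation))


def rerank(prefix: str, candidates: list[str], k: int = 2) -> list[str]:
    """Keep the top-k candidates under the scorer; in the paper this step is
    interleaved with sampling from a base language model inside beam search."""
    return sorted(candidates, key=lambda c: rankgen_score(prefix, c), reverse=True)[:k]


def contrastive_loss(prefix_vec: np.ndarray, pos_vec: np.ndarray,
                     neg_vecs: list[np.ndarray]) -> float:
    """InfoNCE-style training objective: pull the ground-truth continuation toward
    the prefix, push negatives (same-document and model-generated) away."""
    logits = np.array([prefix_vec @ pos_vec] + [prefix_vec @ n for n in neg_vecs])
    return float(-logits[0] + np.log(np.exp(logits).sum()))


if __name__ == "__main__":
    prefix = "The expedition reached the ridge at dawn,"
    candidates = [
        " and the climbers paused to study the glacier below.",
        " and the expedition reached the ridge at dawn again.",  # repetitive
        " meanwhile quarterly stock prices fell sharply in Tokyo.",  # off-topic
    ]
    for c in rerank(prefix, candidates):
        print(round(rankgen_score(prefix, c), 3), c)
```

With the placeholder encoders the printed scores are arbitrary; the point of the sketch is the control flow, where the base language model proposes candidates and the ranking encoder, not token-level likelihood, decides which ones survive.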


Related research

02/13/2022 - A Contrastive Framework for Neural Text Generation
Text generation is of great importance to many natural language processi...

03/19/2021 - Controllable Generation from Pre-trained Language Models via Inverse Prompting
Large-scale pre-trained language models have demonstrated strong capabil...

11/16/2022 - Towards Computationally Verifiable Semantic Grounding for Language Models
The paper presents an approach to semantic grounding of language models ...

01/31/2023 - Large Language Models Can Be Easily Distracted by Irrelevant Context
Large language models have achieved impressive performance on various na...

09/30/2022 - Calibrating Sequence Likelihood Improves Conditional Language Generation
Conditional language models are predominantly trained with maximum likel...

07/08/2022 - Hidden Schema Networks
Most modern language models infer representations that, albeit powerful,...

02/06/2020 - Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Despite strong performance on a variety of tasks, neural sequence models...
