Calibrating Sequence Likelihood Improves Conditional Language Generation

09/30/2022
by Yao Zhao, et al.

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), which assigns probability mass to sparsely observed target sequences. While MLE-trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality. This has been empirically observed in beam search decoding, where output quality degrades with large beam sizes and decoding benefits from heuristics such as length normalization and repetition blocking. In this work, we introduce sequence likelihood calibration (SLiC), where the likelihood of model-generated sequences is calibrated to better align with reference sequences in the model's latent space. With SLiC, decoding heuristics become unnecessary and the quality of decoding candidates improves significantly regardless of the decoding method. Furthermore, SLiC shows no sign of diminishing returns with model scale, and presents alternative ways to improve quality with limited training and inference budgets. With SLiC, we exceed or match SOTA results on a wide range of generation tasks spanning abstractive summarization, question generation, abstractive question answering, and data-to-text generation, even with modest-sized models.
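As an illustrative aside (not taken from the paper's text or released code), the sketch below shows one way a sequence-level calibration objective in this spirit could look in PyTorch: a pairwise rank loss that nudges the model to assign higher likelihood to decoded candidates that are more similar to the reference. The function name `calibration_rank_loss`, the hinge formulation, and the margin value are assumptions for illustration; the paper itself studies several calibration losses and similarity measures.

```python
# Minimal sketch, assuming: per-candidate sequence log-likelihoods under the
# model being calibrated, and a similarity score of each candidate to the
# reference (e.g. cosine similarity of decoder representations).
# This is NOT the authors' implementation; names and margin are hypothetical.

import torch


def calibration_rank_loss(seq_logprobs: torch.Tensor,
                          similarities: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    """Pairwise rank loss over decoded candidates for one example.

    seq_logprobs: (num_candidates,) total log-probability of each candidate.
    similarities: (num_candidates,) similarity of each candidate to the reference.
    """
    # For every ordered pair (i, j) where candidate i is more similar to the
    # reference than candidate j, ask the model to score i above j by a margin.
    better = similarities.unsqueeze(1) > similarities.unsqueeze(0)   # (C, C) mask
    gap = seq_logprobs.unsqueeze(1) - seq_logprobs.unsqueeze(0)      # lp_i - lp_j
    pairwise = torch.relu(margin - gap)                              # hinge penalty
    # Average the penalty over the pairs where a ranking constraint applies.
    loss = (pairwise * better.float()).sum() / better.float().sum().clamp(min=1.0)
    return loss


if __name__ == "__main__":
    # Toy usage: four decoded candidates for a single input.
    lp = torch.tensor([-12.3, -10.1, -15.0, -11.2], requires_grad=True)
    sim = torch.tensor([0.81, 0.42, 0.77, 0.30])
    print(calibration_rank_loss(lp, sim))
```

In practice such a calibration term would be combined with a regularizer that keeps the calibrated model close to its MLE-trained starting point; the toy example above only illustrates the ranking component.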


Related research

02/26/2023
Tailoring Language Generation Models under Total Variation Distance
The standard paradigm of neural language generation adopts maximum likel...

06/14/2019
Comparison of Diverse Decoding Methods from Conditional Language Models
While conditional language models have greatly improved in their ability...

02/06/2020
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Despite strong performance on a variety of tasks, neural sequence models...

10/06/2020
If beam search is the answer, what was the question?
Quite surprisingly, exact maximum a posteriori (MAP) decoding of neural ...

08/12/2019
Neural Text Generation with Unlikelihood Training
Neural text generation is a key tool in natural language applications, b...

05/19/2022
RankGen: Improving Text Generation with Large Ranking Models
Given an input sequence (or prefix), modern language models often assign...

10/24/2022
Mutual Information Alleviates Hallucinations in Abstractive Summarization
Despite significant progress in the quality of language generated from a...
