Investigating Label Bias in Beam Search for Open-ended Text Generation

05/22/2020
by   Liang Wang, et al.
0

Beam search is an effective and widely used decoding algorithm in many sequence-to-sequence (seq2seq) text generation tasks. However, in open-ended text generation, beam search is often found to produce repetitive and generic texts, sampling-based decoding algorithms like top-k sampling and nucleus sampling are more preferred. Standard seq2seq models suffer from label bias due to its locally normalized probability formulation. This paper provides a series of empirical evidence that label bias is a major reason for such degenerate behaviors of beam search. By combining locally normalized maximum likelihood estimation and globally normalized sequence-level training, label bias can be reduced with almost no sacrifice in perplexity. To quantitatively measure label bias, we test the model's ability to discriminate the groundtruth text and a set of context-agnostic distractors. We conduct experiments on large-scale response generation datasets. Results show that beam search can produce more diverse and meaningful texts with our approach, in terms of both automatic and human evaluation metrics. Our analysis also suggests several future working directions towards the grand challenge of open-ended text generation.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/02/2021

MAUVE: Human-Machine Divergence Curves for Evaluating Open-Ended Text Generation

Despite major advances in open-ended text generation, there has been lim...
research
04/22/2020

Residual Energy-Based Models for Text Generation

Text generation is ubiquitous in many NLP tasks, from summarization, to ...
research
08/12/2019

Neural Text Generation with Unlikelihood Training

Neural text generation is a key tool in natural language applications, b...
research
12/08/2022

Momentum Calibration for Text Generation

The input and output of most text generation tasks can be transformed to...
research
10/10/2020

An Empirical Investigation of Beam-Aware Training in Supertagging

Structured prediction is often approached by training a locally normaliz...
research
06/07/2021

A Globally Normalized Neural Model for Semantic Parsing

In this paper, we propose a globally normalized model for context-free g...

Please sign up or login with your details

Forgot password? Click here to reset