Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding

11/14/2022
by Mirac Suzgun, et al.

In open-ended natural-language generation, existing text decoding methods typically struggle to produce text that is both diverse and high-quality. Greedy and beam search are known to suffer from text degeneration and low linguistic diversity, while temperature, top-k, and nucleus sampling often yield diverse but low-quality outputs. In this work, we present crowd sampling, a family of decoding methods based on Bayesian risk minimization, to address this diversity-quality trade-off. Inspired by the principle of "the wisdom of the crowd," crowd sampling seeks to select, from a pool of candidates, the one with the least expected risk (i.e., highest expected reward) under a generative model according to a given utility function. Crowd sampling can be seen as a generalization of numerous existing methods, including majority voting, and in practice it can be used as a drop-in replacement for existing sampling methods. Extensive experiments show that crowd sampling delivers improvements of 3-7 ROUGE and BLEU points across a wide range of tasks, including summarization, data-to-text, translation, and textual style transfer, while achieving new state-of-the-art results on WebNLG and WMT'16.
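To make the selection rule concrete, here is a minimal sketch of Minimum Bayes Risk decoding over a candidate pool. The `unigram_f1` utility is an illustrative stand-in of my own (the paper uses task metrics such as ROUGE or BLEU), and `crowd_sample` is a hypothetical name, not the authors' implementation: it scores each candidate by its average utility against the rest of the pool, a Monte Carlo estimate of expected utility, and returns the argmax.

```python
from collections import Counter

def unigram_f1(hyp: str, ref: str) -> float:
    # Token-overlap F1 as a stand-in utility function
    # (the paper uses metrics such as ROUGE or BLEU).
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    p, rec = overlap / len(h), overlap / len(r)
    return 2 * p * rec / (p + rec)

def crowd_sample(candidates, utility=unigram_f1):
    # Pick the candidate with the highest average utility against
    # the rest of the pool -- i.e., the lowest expected risk, where
    # the pool itself approximates samples from the model.
    def expected_utility(c):
        others = [o for o in candidates if o is not c]
        return sum(utility(c, o) for o in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)

pool = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
print(crowd_sample(pool))  # a high-consensus candidate wins
```

Note how the outlier loses: it shares almost no tokens with the rest of the pool, so its expected utility is near zero, while the two near-duplicates reinforce each other. This is the "wisdom of the crowd" effect the abstract describes.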


Related research:

- Best-k Search Algorithm for Neural Text Generation (11/22/2022). Modern natural language generation paradigms require a good decoding str...
- Comparison of Diverse Decoding Methods from Conditional Language Models (06/14/2019). While conditional language models have greatly improved in their ability...
- Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models (10/18/2022). Decoding methods for large language models often trade-off between diver...
- Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation (05/17/2023). Recent advances in machine translation (MT) have shown that Minimum Baye...
- Neural Text Generation with Part-of-Speech Guided Softmax (05/08/2021). Neural text generation models are likely to suffer from the low-diversit...
- Massive-scale Decoding for Text Generation using Lattices (12/14/2021). Neural text generation models like those used for summarization and tran...
- KL-Divergence Guided Temperature Sampling (06/02/2023). Temperature sampling is a conventional approach to diversify large langu...
