Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation

05/17/2023
by Markus Freitag, et al.

Recent advances in machine translation (MT) have shown that Minimum Bayes Risk (MBR) decoding can be a powerful alternative to beam search decoding, especially when combined with neural-based utility functions. However, the performance of MBR decoding depends heavily on how and how many candidates are sampled from the model. In this paper, we explore how different sampling approaches for generating candidate lists for MBR decoding affect performance. We evaluate popular sampling approaches, such as ancestral, nucleus, and top-k sampling. Based on our insights into their limitations, we experiment with the recently proposed epsilon-sampling approach, which prunes away all tokens with a probability smaller than epsilon, ensuring that each token in a sample receives a fair probability mass. Through extensive human evaluations, we demonstrate that MBR decoding based on epsilon-sampling significantly outperforms not only beam search decoding, but also MBR decoding with all other tested sampling methods across four language pairs.
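
To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a next-token distribution given as a list of probabilities, and it uses a toy token-overlap score as a stand-in for the neural utility functions the abstract mentions. The function names (epsilon_filter, epsilon_sample, mbr_select, token_overlap) are hypothetical, as are two simplifications: falling back to the most likely token when every probability is below epsilon, and reusing the candidate list (including the hypothesis itself) as pseudo-references.

```python
import random


def epsilon_filter(probs, epsilon):
    """Drop tokens whose probability is below epsilon, then renormalize."""
    kept = [p if p >= epsilon else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        # Assumption: if nothing survives, keep only the most likely token.
        best = max(range(len(probs)), key=probs.__getitem__)
        kept = [0.0] * len(probs)
        kept[best] = 1.0
        return kept
    return [p / total for p in kept]


def epsilon_sample(probs, epsilon, rng):
    """Sample one token index from the epsilon-pruned distribution."""
    filtered = epsilon_filter(probs, epsilon)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]


def token_overlap(hyp, ref):
    """Toy utility (Jaccard overlap of tokens); a neural metric would go here."""
    h, r = set(hyp.split()), set(ref.split())
    union = h | r
    return len(h & r) / len(union) if union else 1.0


def mbr_select(candidates, utility):
    """Pick the candidate with the highest average utility, using the
    candidate list itself as pseudo-references."""
    def expected_utility(hyp):
        return sum(utility(hyp, ref) for ref in candidates) / len(candidates)
    return max(candidates, key=expected_utility)


if __name__ == "__main__":
    rng = random.Random(0)
    next_token_probs = [0.5, 0.3, 0.15, 0.04, 0.01]
    print(epsilon_filter(next_token_probs, epsilon=0.05))
    print(epsilon_sample(next_token_probs, epsilon=0.05, rng=rng))

    samples = ["the cat sat", "the cat sat down", "the cat", "a dog ran"]
    print(mbr_select(samples, token_overlap))
```

In a real system, the candidates passed to mbr_select would be full translations generated token by token with epsilon_sample, and the number of candidates and the value of epsilon would both be tuning choices.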

research
08/10/2021

Sampling-Based Minimum Bayes Risk Decoding for Neural Machine Translation

In neural machine translation (NMT), we search for the mode of the model...
research
09/06/2023

Improving Code Generation by Dynamic Temperature Sampling

Recently, Large Language Models (LLMs) have shown impressive results in ...
research
09/14/2023

Masked Generative Modeling with Enhanced Sampling Scheme

This paper presents a novel sampling scheme for masked non-autoregressiv...
research
03/01/2022

RMBR: A Regularized Minimum Bayes Risk Reranking Framework for Machine Translation

Beam search is the most widely used decoding method for neural machine t...
research
09/19/2023

MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods

Recent research in decoding methods for Natural Language Generation (NLG...
research
10/18/2022

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Decoding methods for large language models often trade-off between diver...
research
11/14/2022

Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding

In open-ended natural-language generation, existing text decoding method...
