Decoding and Diversity in Machine Translation

11/26/2020
by Nicholas Roberts, et al.

Neural Machine Translation (NMT) systems are typically evaluated using automated metrics that assess the agreement between generated translations and ground truth candidates. To improve systems with respect to these metrics, NLP researchers employ a variety of heuristic techniques, including searching for the conditional mode (vs. sampling) and incorporating various training heuristics (e.g., label smoothing). While search strategies significantly improve BLEU score, they yield deterministic outputs that lack the diversity of human translations. Moreover, search tends to bias the distribution of translated gender pronouns. This makes human-level BLEU a misleading benchmark in that modern MT systems cannot approach human-level BLEU while simultaneously maintaining human-level translation diversity. In this paper, we characterize distributional differences between generated and real translations, examining the cost in diversity paid for the BLEU scores enjoyed by NMT. Moreover, our study implicates search as a salient source of known bias when translating gender pronouns.
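The contrast between searching for the conditional mode and sampling can be illustrated with a toy sketch. The example below is not the paper's experimental setup: the two-token distribution, its probabilities, and the helper names (`decode_mode`, `decode_sample`, `pronoun_counts`) are all assumptions made for illustration. It shows only the general mechanism: a decoder that always picks the most probable token produces a single deterministic output, while sampling reproduces the model's own distribution over alternatives.

```python
import random
from collections import Counter

# Hypothetical model distribution over the pronoun chosen when translating
# a gender-ambiguous source sentence. The numbers are illustrative only
# and are not taken from the paper.
TOY_DISTRIBUTION = {"he": 0.6, "she": 0.4}


def decode_mode(dist):
    """Search-style decoding: always return the most probable token."""
    return max(dist, key=dist.get)


def decode_sample(dist, rng):
    """Sampling: draw a token in proportion to its model probability."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def pronoun_counts(decoder, n=10_000):
    """Run a zero-argument decoder n times and count the outputs."""
    return Counter(decoder() for _ in range(n))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mode_counts = pronoun_counts(lambda: decode_mode(TOY_DISTRIBUTION))
sample_counts = pronoun_counts(lambda: decode_sample(TOY_DISTRIBUTION, rng))

print("mode-seeking:", dict(mode_counts))
print("sampling:    ", dict(sample_counts))
```

In this toy setting, mode-seeking turns a 60/40 preference in the model into a 100/0 split in the output, which is one simple way a search procedure could amplify an imbalance already present in the model, whereas the sampled counts stay close to the model's probabilities.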

research
09/14/2019

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity

While most neural machine translation (NMT) systems are still trained us...
research
06/01/2021

Gender Bias Amplification During Speed-Quality Optimization in Neural Machine Translation

Is bias amplified when neural machine translation (NMT) models are optim...
research
09/06/2023

Gender-specific Machine Translation with Large Language Models

Decoder-only Large Language Models (LLMs) have demonstrated potential in...
research
04/15/2021

First the worst: Finding better gender translations during beam search

Neural machine translation inference procedures like beam search generat...
research
09/21/2020

Target Conditioning for One-to-Many Generation

Neural Machine Translation (NMT) models often lack diversity in their ge...
research
10/20/2020

Human-Paraphrased References Improve Neural Machine Translation

Automatic evaluation comparing candidate translations to human-generated...
research
02/28/2018

Analyzing Uncertainty in Neural Machine Translation

Machine translation is a popular test bed for research in neural sequenc...
