Learning to Stop in Structured Prediction for Neural Machine Translation

04/01/2019
by   Mingbo Ma, et al.

Beam search optimization resolves many issues in neural machine translation. However, this method lacks principled stopping criteria and does not learn how to stop during training, and in practice the model naturally prefers longer hypotheses at test time because it uses the raw score instead of a probability-based score. We propose a novel ranking method which enables an optimal beam search stopping criterion. We further introduce a structured prediction loss function which penalizes suboptimal finished candidates produced by beam search during training. Experiments on neural machine translation with both synthetic data and real languages (German-to-English and Chinese-to-English) demonstrate that our proposed methods lead to better output length and BLEU scores.
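As a rough illustration of the length preference mentioned in the abstract, the sketch below compares cumulative log-probability scores with cumulative raw scores for a single growing hypothesis. It is not the paper's model or ranking method. The per-step values and the helper `cumulative_scores` are hypothetical, chosen only to show that a sum of log-probabilities can never increase with length, whereas a sum of unnormalized scores can keep growing, which leaves no natural point at which to stop.

```python
import math


def cumulative_scores(step_scores):
    """Return the running total of per-step scores for one hypothesis."""
    totals = []
    running = 0.0
    for score in step_scores:
        running += score
        totals.append(running)
    return totals


# Hypothetical per-step values, chosen only for illustration.
step_probs = [0.9, 0.8, 0.9, 0.7, 0.9]               # per-token probabilities
log_prob_steps = [math.log(p) for p in step_probs]   # each term is <= 0
raw_steps = [1.2, 0.4, 0.9, 0.3, 0.8]                # unnormalized, may be > 0

log_prob_totals = cumulative_scores(log_prob_steps)
raw_totals = cumulative_scores(raw_steps)

# Log-probability totals can only fall (or stay flat) as the hypothesis grows,
# so a finished hypothesis bounds the score of every longer continuation.
log_prob_never_increases = all(
    later <= earlier
    for earlier, later in zip(log_prob_totals, log_prob_totals[1:])
)

# Raw totals in this example rise at every step, so longer always looks better.
raw_keeps_rising = all(
    later > earlier
    for earlier, later in zip(raw_totals, raw_totals[1:])
)

print("log-prob totals:", [round(t, 3) for t in log_prob_totals])
print("raw totals:     ", [round(t, 3) for t in raw_totals])
```

With probability-based scores, the monotone decrease is what makes simple stopping rules sound; with raw scores that guarantee disappears, which is the gap the abstract says the proposed ranking method and loss are meant to address.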


research
08/28/2018

Breaking the Beam Search Curse: A Study of (Re-)Scoring Methods and Stopping Criteria for Neural Machine Translation

Beam search is widely used in neural machine translation, and usually im...
research
06/29/2021

Rethinking the Evaluation of Neural Machine Translation

The evaluation of neural machine translation systems is usually built up...
research
01/31/2019

Learning Efficient Lexically-Constrained Neural Machine Translation with External Memory

Recent years has witnessed dramatic progress of neural machine translati...
research
05/29/2018

Distilling Knowledge for Search-based Structured Prediction

Many natural language processing tasks can be modeled into structured pr...
research
05/02/2022

Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES

The softmax layer in neural machine translation is designed to model the...
research
05/02/2022

The Implicit Length Bias of Label Smoothing on Beam Search Decoding

Label smoothing is ubiquitously applied in Neural Machine Translation (N...
research
07/18/2017

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

Pairwise ranking methods are the basis of many widely used discriminativ...
