Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training

07/17/2023
by Nathaniel Berger et al.

Supervised learning in Neural Machine Translation (NMT) typically follows a teacher forcing paradigm in which reference tokens, rather than the model's own previous predictions, constitute the conditioning context for each prediction. To alleviate this lack of exploration in the space of translations, we present a simple extension of standard maximum likelihood estimation with a contrastive marking objective. The additional training signals are extracted automatically by comparing the system hypothesis against the reference translation, and are used to up-weight correct tokens and down-weight incorrect ones. The proposed training procedure requires one additional translation pass over the training set per epoch and does not alter the standard inference setup. We show that training with contrastive markings yields improvements on top of supervised learning, and is especially useful when learning from post-edits, where contrastive markings indicate human error corrections to the original hypotheses. Code is publicly released.
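To make the up/down-weighting idea concrete, below is a minimal sketch (not the authors' released code) of how a contrastive marking term could be added to standard token-level cross-entropy in PyTorch. The bag-of-words token matching, the signed +1/-1 marks, the coefficient alpha, and all function names are illustrative assumptions; the paper obtains markings by comparing the system hypothesis against the reference (or a human post-edit), which in practice would use an alignment rather than simple membership.

```python
# Hypothetical sketch: MLE on the reference plus a contrastive marking term
# on the model's own hypothesis. Names and weighting scheme are assumptions.
import torch
import torch.nn.functional as F

def mark_tokens(hyp_ids, ref_ids):
    """Mark each hypothesis token +1.0 if it occurs in the reference, -1.0 otherwise.
    (A real system would derive marks from an alignment or a post-edit.)"""
    ref_set = set(ref_ids)
    return torch.tensor([1.0 if t in ref_set else -1.0 for t in hyp_ids])

def contrastive_marking_loss(log_probs_ref, ref_ids, log_probs_hyp, hyp_ids, marks, alpha=0.5):
    """Standard teacher-forced NLL on the reference, plus a signed term that raises
    the log-probability of hypothesis tokens marked correct and lowers it for
    tokens marked incorrect."""
    nll = F.nll_loss(log_probs_ref, ref_ids, reduction="mean")
    hyp_token_logp = log_probs_hyp.gather(1, hyp_ids.unsqueeze(1)).squeeze(1)
    marking_term = -(marks * hyp_token_logp).mean()
    return nll + alpha * marking_term

# Toy usage with random "model outputs" (vocabulary of 10, sequence length 5).
vocab, T = 10, 5
log_probs_ref = torch.log_softmax(torch.randn(T, vocab), dim=-1)
log_probs_hyp = torch.log_softmax(torch.randn(T, vocab), dim=-1)
ref_ids = torch.randint(vocab, (T,))
hyp_ids = torch.randint(vocab, (T,))
marks = mark_tokens(hyp_ids.tolist(), ref_ids.tolist())
print(contrastive_marking_loss(log_probs_ref, ref_ids, log_probs_hyp, hyp_ids, marks).item())
```

Under these assumptions, minimizing the marking term increases the probability of hypothesis tokens judged correct and decreases it for tokens judged incorrect, while the extra translation pass per epoch is only needed to produce the hypotheses that get marked; inference is unchanged.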

Related research:

03/03/2022
As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning
Omission and addition of content is a typical issue in neural machine tr...

12/14/2016
How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs
Analysing translation quality in regards to specific linguistic phenomen...

04/26/2022
When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?
Word alignment has proven to benefit many-to-many neural machine transla...

12/19/2016
Neural Machine Translation from Simplified Translations
Text simplification aims at reducing the lexical, grammatical and struct...

12/08/2015
Minimum Risk Training for Neural Machine Translation
We propose minimum risk training for end-to-end neural machine translati...

05/02/2022
Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES
The softmax layer in neural machine translation is designed to model the...

04/01/2021
Sampling and Filtering of Neural Machine Translation Distillation Data
In most of neural machine translation distillation or stealing scenarios...
