Simulated Multiple Reference Training Improves Low-Resource Machine Translation

04/30/2020
by Huda Khayrallah, et al.

Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings. We introduce a novel MT training method that approximates the full space of possible translations by sampling a paraphrase of the reference sentence from a paraphraser and training the MT model to predict the paraphraser's distribution over possible tokens. Using an English paraphraser, we demonstrate the effectiveness of our method in low-resource settings, with gains of 1.2 to 7 BLEU.
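To illustrate the objective described in the abstract, below is a minimal sketch in PyTorch. It is not the authors' implementation: the model objects (mt_model, paraphraser), the sample_with_logits helper, and the call signatures are hypothetical, and details such as batching and decoder-input shifting are omitted. The sketch only shows the core idea of training the MT model against the paraphraser's soft token distributions rather than a one-hot reference.

    # Minimal sketch, assuming hypothetical seq2seq models `mt_model` and
    # `paraphraser` that share a target vocabulary and return per-token
    # logits of shape (T, V).
    import torch
    import torch.nn.functional as F

    def simulated_multi_reference_loss(mt_model, paraphraser, src_tokens, ref_tokens):
        """Sample a paraphrase of the reference, then train the MT model to
        predict the paraphraser's distribution over tokens at each position."""
        with torch.no_grad():
            # Hypothetical helper: sample a paraphrase of the reference and
            # keep the paraphraser's per-step logits over the vocabulary.
            para_tokens, para_logits = paraphraser.sample_with_logits(ref_tokens)
            soft_targets = F.softmax(para_logits, dim=-1)   # (T, V) soft labels

        # Teacher-force the MT model on the sampled paraphrase.
        mt_logits = mt_model(src_tokens, para_tokens)        # (T, V)
        log_probs = F.log_softmax(mt_logits, dim=-1)

        # Cross-entropy against the paraphraser's full distribution,
        # rather than a single one-hot reference token per position.
        return -(soft_targets * log_probs).sum(dim=-1).mean()

In effect, each training step sees a newly sampled reference paraphrase with soft targets, which simulates training on multiple reference translations without collecting them.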


