Dynamic Data Selection for Neural Machine Translation

08/02/2017
by Marlies van der Wees, et al.

Intelligent selection of training data has proven a successful technique for simultaneously increasing training efficiency and translation performance in phrase-based machine translation (PBMT). With the recent rise in popularity of neural machine translation (NMT), this paper explores to what extent and how NMT can also benefit from data selection. While state-of-the-art data selection (Axelrod et al., 2011) consistently performs well for PBMT, we show that its gains are substantially lower for NMT. We then introduce dynamic data selection for NMT, a method in which the selected subset of training data varies between training epochs. Our experiments show that the best results are achieved with a technique we call gradual fine-tuning, with improvements of up to +2.6 BLEU over the original data selection approach and up to +3.1 BLEU over a general baseline.
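
The idea of varying the selected subset between epochs can be illustrated with a minimal Python sketch. The abstract does not specify the selection schedule, so everything below is an assumption made for illustration: the function names, the geometric shrinking of the subset, the `hold` parameter, and the convention that a lower relevance score means a more in-domain sentence pair (as with a cross-entropy difference score in the style of Axelrod et al., 2011). It should not be read as the authors' implementation.

```python
from typing import List, Sequence


def gradual_subset_sizes(total: int, start_frac: float, shrink: float,
                         epochs: int, hold: int = 1) -> List[int]:
    """Hypothetical schedule: number of sentence pairs to train on per epoch.

    The subset starts at start_frac * total and is multiplied by `shrink`
    every `hold` epochs (an assumed schedule, not taken from the abstract).
    """
    sizes = []
    for epoch in range(epochs):
        frac = start_frac * shrink ** (epoch // hold)
        sizes.append(max(1, int(total * frac)))
    return sizes


def select_for_epoch(scores: Sequence[float], size: int) -> List[int]:
    """Return indices of the `size` most relevant sentence pairs.

    Assumption: lower score means more in-domain.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:size]


if __name__ == "__main__":
    # Toy relevance scores for six sentence pairs (made-up values).
    scores = [0.3, -1.2, 0.9, -0.1, 0.5, -0.7]
    for epoch, size in enumerate(gradual_subset_sizes(len(scores), 1.0, 0.5, 3)):
        print(epoch, select_for_epoch(scores, size))
```

In this sketch each epoch trains on a smaller, more in-domain slice of the ranked data, which is one plausible reading of "gradual fine-tuning"; the actual subset sizes and schedule would need to be taken from the full paper.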

