Neural Machine Translation Training in a Multi-Domain Scenario

08/29/2017
by   Hassan Sajjad, et al.

In this paper, we explore alternative ways to train a neural machine translation system in a multi-domain scenario. We investigate data concatenation (with fine-tuning), model stacking (multi-level fine-tuning), data selection, and weighted ensembles. We evaluate these methods on three criteria: i) translation quality, ii) training time, and iii) robustness to out-of-domain tests. Our findings on Arabic-English and German-English language pairs show that the best translation quality is achieved by building an initial system on a concatenation of the available out-of-domain data and then fine-tuning it on the in-domain data. Model stacking works best when training begins with the furthest out-of-domain data and the model is incrementally fine-tuned on each successively closer domain. Data selection did not give the best results, but it is a decent compromise between training time and translation quality. A weighted ensemble of the individual models outperformed data selection and is beneficial in scenarios where there is no time for fine-tuning.
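The weighted ensemble described above can be sketched as follows: each domain-specific model proposes a distribution over the next target token, and the ensemble interpolates these distributions with per-model weights. This is a minimal illustration, not the paper's implementation; the model names and weights are hypothetical.

```python
def weighted_ensemble(distributions, weights):
    """Interpolate per-model next-token distributions.

    distributions: list of dicts mapping token -> probability, one per model.
    weights: list of non-negative floats summing to 1, one per model.
    Returns a single dict mapping token -> interpolated probability.
    """
    combined = {}
    for dist, weight in zip(distributions, weights):
        for token, prob in dist.items():
            combined[token] = combined.get(token, 0.0) + weight * prob
    return combined

# Two toy domain models (names illustrative) that disagree on the next token.
p_news = {"house": 0.7, "home": 0.3}
p_ted = {"house": 0.2, "home": 0.8}

# Weight the news-domain model more heavily.
mix = weighted_ensemble([p_news, p_ted], [0.6, 0.4])
# mix["house"] = 0.6 * 0.7 + 0.4 * 0.2 = 0.50
```

In practice the weights can be tuned on a held-out in-domain set, which is what makes this approach attractive when there is no time to fine-tune the models themselves.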


