Regularization techniques for fine-tuning in neural machine translation

We investigate techniques for supervised domain adaptation in neural machine translation (NMT), where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major challenge. We examine several techniques to reduce overfitting and improve transfer learning, including regularization techniques such as dropout and L2-regularization towards an out-of-domain prior. In addition, we introduce tuneout, a novel regularization technique inspired by dropout. We apply these techniques, alone and in combination, to NMT, obtaining improvements on IWSLT datasets for English->German and English->Russian. We also investigate how much in-domain training data is needed for domain adaptation in NMT, and find a logarithmic relationship between the amount of training data and the gain in BLEU score.
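To make the idea of L2-regularization towards an out-of-domain prior concrete, here is a minimal sketch. It assumes the penalty is the squared Euclidean distance between the fine-tuned parameters and the out-of-domain parameters, scaled by a strength hyperparameter. The abstract does not state the exact form, so this is an illustration rather than the authors' implementation. The names `l2_to_prior_penalty`, `l2_to_prior_grad`, and `strength`, and the dictionary-of-arrays parameter layout, are all hypothetical.

```python
import numpy as np


def l2_to_prior_penalty(params, prior_params, strength):
    """Penalty that grows as fine-tuned parameters drift from the prior.

    Assumed form: strength * sum over all parameters of (theta - theta_prior)^2.
    `params` and `prior_params` are dicts mapping names to NumPy arrays.
    """
    total = 0.0
    for name, value in params.items():
        diff = value - prior_params[name]
        total += float(np.sum(diff ** 2))
    return strength * total


def l2_to_prior_grad(params, prior_params, strength):
    """Gradient of the penalty above with respect to each parameter array."""
    return {
        name: 2.0 * strength * (value - prior_params[name])
        for name, value in params.items()
    }


# Toy usage: the prior stands in for the out-of-domain model's weights.
prior = {"w": np.array([0.0, 0.0])}
current = {"w": np.array([1.0, 2.0])}
penalty = l2_to_prior_penalty(current, prior, strength=0.5)
grad = l2_to_prior_grad(current, prior, strength=0.5)
```

In a fine-tuning loop, the penalty would be added to the in-domain training loss, so that gradient steps are pulled back towards the out-of-domain solution instead of towards zero as with ordinary weight decay.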


