Domain Adaptation of Neural Machine Translation by Lexicon Induction

06/02/2019
by   Junjie Hu, et al.

It has been previously noted that neural machine translation (NMT) is very sensitive to domain shift. In this paper, we argue that this is a dual effect of the highly lexicalized nature of NMT: it causes failures on sentences with large numbers of unknown words, and it leaves domain-specific words without supervision. To remedy this problem, we propose an unsupervised adaptation method that fine-tunes a pre-trained out-of-domain NMT model on a pseudo-in-domain corpus. Specifically, we perform lexicon induction to extract an in-domain lexicon, and construct a pseudo-parallel in-domain corpus by word-for-word back-translation of monolingual in-domain target sentences. Across five domains, twenty pairwise adaptation settings, and two model architectures, our method achieves consistent improvements without using any in-domain parallel sentences, improving by up to 14 BLEU over unadapted models and by up to 2 BLEU over strong back-translation baselines.
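The corpus-construction step described in the abstract can be illustrated with a minimal sketch. It assumes an induced lexicon is already available as a plain target-to-source dictionary and that target words missing from the lexicon are copied unchanged. The function names, the toy lexicon, and the example sentences are hypothetical and are not taken from the paper. The lexicon induction step itself is not shown.

```python
from typing import Dict, List, Tuple


def word_for_word_back_translate(target_sentence: str,
                                 lexicon: Dict[str, str]) -> str:
    """Map each target token to a source token via the lexicon.

    Assumption: tokens missing from the lexicon are copied unchanged.
    """
    tokens = target_sentence.split()
    return " ".join(lexicon.get(tok, tok) for tok in tokens)


def build_pseudo_parallel(target_sentences: List[str],
                          lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each pseudo-source sentence with its original target sentence."""
    return [(word_for_word_back_translate(t, lexicon), t)
            for t in target_sentences]


# Hypothetical toy lexicon (target word -> source word), for illustration only.
toy_lexicon = {
    "the": "die",
    "patient": "patientin",
    "receives": "bekommt",
    "a": "eine",
    "dose": "dosis",
}

# Hypothetical monolingual in-domain target sentences.
monolingual_target = [
    "the patient receives a dose",
    "the dose is low",
]

pairs = build_pseudo_parallel(monolingual_target, toy_lexicon)
for pseudo_source, target in pairs:
    print(pseudo_source, "|||", target)
```

The resulting (pseudo-source, target) pairs would then serve as fine-tuning data for the out-of-domain model. The pseudo-source side is noisy, since word order and inflection are not handled, but the target side is genuine in-domain text.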

research
04/30/2020

Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine Translation

Neural machine translation (NMT) models do not work well in domains diff...
research
11/07/2019

Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing

Many multi-domain neural machine translation (NMT) models achieve knowle...
research
10/28/2022

Domain Adaptation of Machine Translation with Crowdworkers

Although a machine translation model trained with a large in-domain para...
research
09/23/2021

Exploiting Curriculum Learning in Unsupervised Neural Machine Translation

Back-translation (BT) has become one of the de facto components in unsup...
research
04/20/2022

DaLC: Domain Adaptation Learning Curve Prediction for Neural Machine Translation

Domain Adaptation (DA) of Neural Machine Translation (NMT) model often r...
research
06/21/2021

Phrase-level Active Learning for Neural Machine Translation

Neural machine translation (NMT) is sensitive to domain shift. In this p...
research
10/06/2020

Iterative Domain-Repaired Back-Translation

In this paper, we focus on the domain-specific translation with low reso...
