Iterative Domain-Repaired Back-Translation

10/06/2020
by Hao-Ran Wei, et al.

In this paper, we focus on domain-specific translation in low-resource settings, where in-domain parallel corpora are scarce or nonexistent. One common and effective strategy in this case is to exploit in-domain monolingual data through back-translation. However, the resulting synthetic parallel data is very noisy because it is generated by imperfect out-of-domain systems, which leads to poor domain adaptation performance. To address this issue, we propose a novel iterative domain-repaired back-translation framework, which introduces a Domain-Repair (DR) model to refine the translations in synthetic bilingual data. To this end, we construct training data for the DR model by round-trip translating the monolingual sentences, and then design a unified training framework to jointly optimize paired DR and NMT models. Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach, which achieves average improvements of 15.79 and 4.47 BLEU over unadapted models and back-translation, respectively.
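
The round-trip construction of DR training data mentioned in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Translator` interface, the `build_repair_pairs` helper, and the toy translators are hypothetical, and the pairing direction (noisy round-trip output as input, original in-domain sentence as target) is one plausible reading of the abstract rather than the authors' confirmed procedure.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a translator maps one sentence to one sentence.
Translator = Callable[[str], str]


def build_repair_pairs(
    mono: List[str],
    forward: Translator,
    backward: Translator,
) -> List[Tuple[str, str]]:
    """Round-trip each monolingual sentence and pair the result with the original.

    Assumption: the repair model would learn to map the noisy round-trip
    output (first element) back to the clean in-domain sentence (second).
    """
    pairs = []
    for sentence in mono:
        pivot = forward(sentence)      # translate into the other language
        round_trip = backward(pivot)   # translate back into the original language
        pairs.append((round_trip, sentence))
    return pairs


# Toy stand-ins for out-of-domain NMT systems (not real models).
def toy_forward(sentence: str) -> str:
    return sentence.upper()


def toy_backward(sentence: str) -> str:
    # Simulates a domain error: the in-domain term "dosage" comes back as "dose".
    return sentence.lower().replace("dosage", "dose")


pairs = build_repair_pairs(["take the dosage twice daily"], toy_forward, toy_backward)
print(pairs)
```

In a real setting the two callables would wrap trained source-to-target and target-to-source NMT models, and the resulting pairs would serve as training examples for a repair model of the kind the abstract describes.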


research
06/01/2018

A Survey of Domain Adaptation for Neural Machine Translation

Neural machine translation (NMT) is a deep learning based approach for m...
research
01/29/2021

Synthesizing Monolingual Data for Neural Machine Translation

In neural machine translation (NMT), monolingual data in the target lang...
research
06/02/2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction

It has been previously noted that neural machine translation (NMT) is ve...
research
09/14/2021

Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation

Recently, kNN-MT has shown the promising capability of directly incorpor...
research
04/05/2020

AR: Auto-Repair the Synthetic Data for Neural Machine Translation

Compared with only using limited authentic parallel data as training cor...
research
04/07/2020

Dynamic Data Selection and Weighting for Iterative Back-Translation

Back-translation has proven to be an effective method to utilize monolin...
research
12/11/2021

Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts

Continuously-growing data volumes lead to larger generic models. Specifi...
