Language Model Prior for Low-Resource Neural Machine Translation

04/30/2020
by   Christos Baziotis, et al.

The scarcity of large parallel corpora is an important obstacle for neural machine translation. A common solution is to exploit the knowledge of language models (LMs) trained on abundant monolingual data. In this work, we propose a novel approach that incorporates an LM as a prior in a neural translation model (TM). Specifically, we add a regularization term that pushes the output distributions of the TM to be probable under the LM prior, while avoiding wrong predictions when the TM "disagrees" with the LM. This objective relates to knowledge distillation, where the LM can be viewed as teaching the TM about the target language. The proposed approach does not compromise decoding speed, because the LM is used only at training time, unlike previous work that requires it during inference. We present an analysis of the effects that different methods have on the output distributions of the TM. Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
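The regularization idea in the abstract can be illustrated with a small per-token sketch. The abstract does not give the exact formula, so the form used here is an assumption: the usual translation cross-entropy plus a weighted KL divergence from the TM distribution to the LM distribution. The function names, the weight `lam`, and the toy logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def regularized_token_loss(tm_logits, lm_logits, target_index, lam=0.5):
    """Assumed objective for one target token: NLL + lam * KL(p_TM || p_LM).

    tm_logits: translation model scores over the vocabulary
    lm_logits: language model scores over the same vocabulary (the LM is
               frozen during training and is not needed at inference)
    lam:       hypothetical weight of the LM-prior term
    """
    p_tm = softmax(tm_logits)
    p_lm = softmax(lm_logits)
    nll = -math.log(p_tm[target_index])   # standard translation loss
    prior = kl_divergence(p_tm, p_lm)     # penalty for straying from the LM prior
    return nll + lam * prior, nll, prior


# Toy example with a three-word vocabulary.
tm_logits = [2.0, 0.5, -1.0]
lm_logits = [1.5, 1.0, -2.0]
total, nll, prior = regularized_token_loss(tm_logits, lm_logits, target_index=0)
print(f"nll={nll:.4f} prior={prior:.4f} total={total:.4f}")
```

Because the LM term appears only in the training loss, decoding uses the TM alone, which is consistent with the abstract's claim that inference speed is unaffected.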

