A Survey of Methods to Leverage Monolingual Data in Low-resource Neural Machine Translation

10/01/2019
by   Ilshat Gibadullin, et al.

Neural machine translation (NMT) has become the state of the art for language pairs with large parallel corpora. However, translation quality for low-resource languages leaves much to be desired. Several approaches can mitigate this problem, such as transfer learning and semi-supervised and unsupervised learning techniques. In this paper, we review existing methods whose main idea is to exploit the power of monolingual data, which, compared to parallel data, is usually easier to obtain and available in significantly greater quantities.


