Neural machine translation for low-resource languages

08/18/2017
by Robert Östling, et al.

Neural machine translation (NMT) approaches have improved the state of the art in many machine translation settings over the last couple of years, but they require large amounts of training data to produce sensible output. We demonstrate that NMT can be used for low-resource languages as well, by introducing more local dependencies and using word alignments to learn sentence reordering during translation. In addition to our novel model, we also present an empirical evaluation of low-resource phrase-based statistical machine translation (SMT) and NMT to investigate the lower limits of the respective technologies. We find that while SMT remains the best option for low-resource settings, our method can produce acceptable translations with only 70000 tokens of training data, a level where the baseline NMT system fails completely.

research
05/28/2019

Revisiting Low-Resource Neural Machine Translation: A Case Study

It has been shown that the performance of neural machine translation (NM...
research
03/04/2020

Evaluating Low-Resource Machine Translation between Chinese and Vietnamese with Back-Translation

Back translation (BT) has been widely used and become one of standard te...
research
09/17/2019

Pointer-based Fusion of Bilingual Lexicons into Neural Machine Translation

Neural machine translation (NMT) systems require large amounts of high q...
research
11/30/2020

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

Large amounts of data has made neural machine translation (NMT) a big su...
research
11/02/2018

Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation

We aim to better exploit the limited amounts of parallel text available ...
research
03/07/2021

Translating the Unseen? Yorùbá → English MT in Low-Resource, Morphologically-Unmarked Settings

Translating between languages where certain features are marked morpholo...
research
07/13/2021

On the Difficulty of Translating Free-Order Case-Marking Languages

Identifying factors that make certain languages harder to model than oth...
