Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data

05/31/2021
by Wei-Jen Ko, et al.

The scarcity of parallel data is a major obstacle to training high-quality machine translation systems for low-resource languages. Fortunately, some low-resource languages are linguistically related or similar to high-resource languages; these related languages may share many lexical or syntactic structures. In this work, we exploit this linguistic overlap to facilitate translating to and from a low-resource language using only monolingual data in that language, together with any parallel data available for the related high-resource language. Our method, NMT-Adapt, combines denoising autoencoding, back-translation, and adversarial objectives to utilize monolingual data for low-resource adaptation. We experiment on 7 languages from three different language families and show that our technique significantly improves translation into the low-resource languages compared to other translation baselines.

research
02/27/2022

OCR Improves Machine Translation for Low-Resource Languages

We aim to investigate the performance of current OCR systems on low reso...
research
11/07/2019

Low-Resource Machine Translation using Interlinear Glosses

Neural Machine Translation (NMT) does not handle low-resource translatio...
research
02/23/2017

Utilizing Lexical Similarity between Related, Low-resource Languages for Pivot-based SMT

We investigate pivot-based translation between related languages in a lo...
research
05/23/2023

A Simple Method for Unsupervised Bilingual Lexicon Induction for Data-Imbalanced, Closely Related Language Pairs

Existing approaches for unsupervised bilingual lexicon induction (BLI) o...
research
04/19/2023

The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages

Efficiently and accurately translating a corpus into a low-resource lang...
research
05/23/2023

When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale

Multilingual machine translation (MMT), trained on a mixture of parallel...
research
01/20/2023

Is ChatGPT A Good Translator? A Preliminary Study

This report provides a preliminary evaluation of ChatGPT for machine tra...
