Handling Syntactic Divergence in Low-resource Machine Translation

08/30/2019
by   Chunting Zhou, et al.

Despite the impressive empirical successes of neural machine translation (NMT) on standard benchmarks, limited parallel data impedes the application of NMT models to many language pairs. Data augmentation methods such as back-translation make it possible to use monolingual data to alleviate this problem, but back-translation itself fails in extreme low-resource scenarios, especially for syntactically divergent languages. In this paper, we propose a simple yet effective solution in which target-language sentences are re-ordered to match the order of the source and used as an additional source of training-time supervision. Experiments on simulated low-resource Japanese-to-English and real low-resource Uyghur-to-English scenarios show significant improvements over other semi-supervised alternatives.
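
To make the re-ordering idea concrete, here is a minimal Python sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes a hypothetical word alignment (a mapping from target token index to source token index) is already available, and the function name `reorder_to_source_order` is invented for illustration. Target tokens are sorted by the position of their aligned source token, and unaligned tokens stay attached to the next aligned token on their right.

```python
from typing import Dict, List, Tuple


def reorder_to_source_order(target_tokens: List[str],
                            alignment: Dict[int, int]) -> List[str]:
    """Reorder target tokens to follow the order of their aligned source tokens.

    `alignment` maps target index -> source index. It is a hypothetical input:
    a real system might obtain it from a word aligner or reordering rules.
    Unaligned target tokens borrow the source position of the next aligned
    token to their right, so e.g. a determiner stays in front of its noun.
    """
    keys: List[Tuple[float, int]] = [(0.0, 0)] * len(target_tokens)
    next_key: float = float("inf")  # trailing unaligned tokens stay at the end
    # Walk right to left so each unaligned token sees its right neighbour's key.
    for i in range(len(target_tokens) - 1, -1, -1):
        if i in alignment:
            next_key = alignment[i]
        # The original index breaks ties and keeps the sort stable.
        keys[i] = (next_key, i)
    order = sorted(range(len(target_tokens)), key=lambda i: keys[i])
    return [target_tokens[i] for i in order]


# Toy example with a verb-final source order (subject, object, verb).
english = ["I", "ate", "an", "apple"]
# Assumed alignment: "I" -> 0, "ate" -> 4, "apple" -> 2; "an" is unaligned.
toy_alignment = {0: 0, 1: 4, 3: 2}
reordered = reorder_to_source_order(english, toy_alignment)
print(" ".join(reordered))
```

In this toy case the verb moves to the end of the English sentence, mirroring a verb-final source. How such re-ordered sentences are then used as extra supervision during training is described in the full paper and is not sketched here.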

research
10/07/2019

Overcoming the Rare Word Problem for Low-Resource Language Pairs in Neural Machine Translation

Among the six challenges of neural machine translation (NMT) coined by (...
research
12/28/2021

A Preordered RNN Layer Boosts Neural Machine Translation in Low Resource Settings

Neural Machine Translation (NMT) models are strong enough to convey sema...
research
05/29/2018

Bi-Directional Neural Machine Translation with Synthetic Parallel Data

Despite impressive progress in high-resource settings, Neural Machine Tr...
research
04/02/2023

Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages

The advent of deep learning has led to a significant gain in machine tra...
research
06/10/2019

Generalized Data Augmentation for Low-Resource Translation

Translation to or from low-resource languages (LRLs) poses challenges for ...
research
03/24/2021

Low-Resource Machine Translation for Low-Resource Languages: Leveraging Comparable Data, Code-Switching and Compute Resources

We conduct an empirical study of unsupervised neural machine translation...
research
04/05/2020

Incorporating Bilingual Dictionaries for Low Resource Semi-Supervised Neural Machine Translation

We explore ways of incorporating bilingual dictionaries to enable semi-s...