Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

04/09/2022
by Bin Li, et al.

The last decade has witnessed enormous advances in science and technology, stimulating a growing demand for economic and cultural exchange among countries. Building neural machine translation (NMT) systems has therefore become increasingly important, especially in low-resource settings. However, recent work on low-resource NMT has largely centered on English, while few studies focus on low-resource NMT centered on other languages such as Chinese. To address this gap, the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition provides Chinese-centric multilingual low-resource NMT tasks, in which participants are required to build NMT systems from the provided low-resource samples. In this paper, we present the winning system, which leverages data enhancement based on monolingual word embeddings, bilingual curriculum learning, and contrastive re-ranking. In addition, we propose a new Incomplete-Trust (In-trust) loss function to replace the traditional cross-entropy loss during training. The experimental results demonstrate that these ideas lead to better performance than other state-of-the-art methods. All experimental code is released at: https://github.com/WENGSYX/Low-resource-text-translation.
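The abstract only names the Incomplete-Trust (In-trust) loss; the exact formulation is given in the paper and released code. As a rough illustration of the idea, the sketch below assumes the loss mixes standard cross-entropy with a robustness term of the form -p·log(δp + (1-δ)q), where p is the model distribution, q the one-hot reference, and α, β, δ are hyperparameters. The class name InTrustLoss and the default hyperparameter values are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InTrustLoss(nn.Module):
    """Sketch of an Incomplete-Trust (In-trust) style loss (assumed form).

    Combines cross-entropy with a term that only partially "trusts" the
    (possibly noisy) reference labels by mixing them with the model's
    own predictions. alpha, beta, delta are assumed hyperparameters.
    """

    def __init__(self, alpha=1.0, beta=1.0, delta=0.5, ignore_index=-100):
        super().__init__()
        self.alpha = alpha
        self.beta = beta
        self.delta = delta
        self.ignore_index = ignore_index

    def forward(self, logits, target):
        # logits: (num_tokens, vocab_size), target: (num_tokens,)
        ce = F.cross_entropy(logits, target, ignore_index=self.ignore_index)

        probs = F.softmax(logits, dim=-1)
        mask = target.ne(self.ignore_index)
        one_hot = F.one_hot(target.clamp(min=0),
                            num_classes=logits.size(-1)).float()

        # Robust term: -p * log(delta * p + (1 - delta) * q)
        mixed = self.delta * probs + (1.0 - self.delta) * one_hot
        dce = -(probs * torch.log(mixed.clamp(min=1e-8))).sum(dim=-1)
        dce = dce[mask].mean()

        return self.alpha * ce + self.beta * dce
```

In this assumed form, setting beta to zero recovers plain cross-entropy, while the second term down-weights the penalty on tokens where the reference disagrees with a confident model prediction, which is the intuition behind only partially trusting noisy low-resource training pairs.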
