An Empirical Investigation of Multi-bridge Multilingual NMT models

10/14/2021
by   Anoop Kunchukuttan, et al.

In this paper, we present an extensive investigation of multi-bridge, many-to-many multilingual NMT models (MB-M2M), i.e., models trained on non-English language pairs in addition to English-centric language pairs. In addition to validating previous work which shows that MB-M2M models can overcome zero-shot translation problems, our analysis reveals the following results about multi-bridge models: (1) it is possible to extract a reasonable amount of parallel corpora between non-English languages for low-resource languages; (2) with limited non-English-centric data, MB-M2M models are competitive with or outperform pivot models; and (3) MB-M2M models can outperform English-Any models and perform on par with Any-English models, so a single multilingual NMT system can serve all translation directions.


