
Lego-MT: Towards Detachable Models in Massively Multilingual Machine Translation

12/20/2022
by Fei Yuan et al.

Traditional multilingual neural machine translation (MNMT) uses a single model to translate all directions. However, as the number of language pairs grows, a single model for massive MNMT faces new challenges: parameter tension and heavy computational cost. In this paper, we revisit multi-way structures by assigning an individual branch to each language (group). Despite its simple architecture, such a decentralized model is challenging to train because there are no constraints that align the representations of all languages. We propose a localized training recipe that maps the different branches into a unified space, yielding an efficient detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build the first large-scale open-source translation benchmark, covering 7 language-centric datasets, each containing 445 language pairs. Experiments show that Lego-MT (1.2B parameters) brings gains of more than 4 BLEU while outperforming M2M-100 (12B parameters). (We will release all training data, models, and checkpoints.)
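The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of how a multi-way, detachable model could be organized: one branch per language, with every direction routed through a single shared representation space. All names (Branch, MultiWayMT), dimensions, and the routing logic are illustrative assumptions rather than the paper's actual code; in particular, the "decoder" branches here are plain encoder stacks for brevity (a real decoder would also cross-attend to source states), and the localized training recipe that aligns the branches is not shown.

    import torch
    import torch.nn as nn

    class Branch(nn.Module):
        """One language-(group-)specific branch: a small Transformer stack."""
        def __init__(self, d_model=512, nhead=8, num_layers=2):
            super().__init__()
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.stack = nn.TransformerEncoder(layer, num_layers)

        def forward(self, x):
            return self.stack(x)

    class MultiWayMT(nn.Module):
        """Routes each direction through its own source and target branch,
        with branch outputs meeting in one unified representation space."""
        def __init__(self, langs, d_model=512):
            super().__init__()
            self.encoders = nn.ModuleDict({l: Branch(d_model) for l in langs})
            self.decoders = nn.ModuleDict({l: Branch(d_model) for l in langs})

        def forward(self, x, src_lang, tgt_lang):
            shared = self.encoders[src_lang](x)      # source -> unified space
            return self.decoders[tgt_lang](shared)   # unified space -> target

    model = MultiWayMT(["en", "fr", "de"])
    src = torch.randn(2, 10, 512)   # (batch, seq_len, d_model) token embeddings
    out = model(src, "en", "fr")    # route an en batch through the fr branch

    # Detachability: a single branch can be saved or loaded on its own,
    # so serving one direction never requires the full parameter set.
    torch.save(model.encoders["en"].state_dict(), "enc_en.pt")

The point of the ModuleDict layout is that branches share nothing except the space they map into: any encoder/decoder pair composes, and unused branches can stay on disk.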

Related research

Beyond English-Centric Multilingual Machine Translation (10/21/2020)
Existing work in translation demonstrated the potential of massively mul...

University of Cape Town's WMT22 System: Multilingual Machine Translation for Southern African Languages (10/21/2022)
The paper describes the University of Cape Town's submission to the cons...

Learning Language Specific Sub-network for Multilingual Machine Translation (05/19/2021)
Multilingual neural machine translation aims at learning a single transl...

Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning (05/14/2022)
Machine translation (MT) involving Indigenous languages, including those...

Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information (10/07/2020)
We investigate the following question for machine translation (MT): can ...

Findings of the Covid-19 MLIA Machine Translation Task (11/14/2022)
This work presents the results of the machine translation (MT) task from...

Scalable and Efficient MoE Training for Multitask Multilingual Models (09/22/2021)
The Mixture of Experts (MoE) models are an emerging class of sparsely ac...