Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

02/07/2023
by   Simeng Sun, et al.

With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages. However, adding new languages requires updating the vocabulary, which complicates the reuse of embeddings. The question of how to reuse existing models while also making architectural changes to provide capacity for both old and new languages has also not been closely studied. In this work, we introduce three techniques that help speed up effective learning of the new languages and alleviate catastrophic forgetting despite vocabulary and architecture mismatches. Our results show that by (1) carefully initializing the network, (2) applying learning rate scaling, and (3) performing data up-sampling, it is possible to exceed the performance of a same-sized baseline model with 30% of the computation and recover the performance of a larger model trained from scratch with over 50% of the computation. Our analysis further shows that the introduced techniques help learn the new translation directions more effectively and alleviate catastrophic forgetting at the same time. We hope our work will guide research into more efficient approaches to growing the language coverage of MMT models and ultimately maximize the reuse of existing models.
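Two of the three ingredients can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: it assumes (a) an extended vocabulary where tokens shared with the old vocabulary reuse their trained embedding vectors while genuinely new tokens get small random ones, and (b) temperature-based up-sampling, a standard scheme in multilingual training where language-pair sampling probabilities follow p_i ∝ (n_i / N)^(1/T). Function names and the exact initialization scale are hypothetical.

```python
import random

def init_new_embeddings(old_vocab, new_vocab, old_emb, dim, scale=0.02):
    """Build an embedding table for an extended vocabulary.

    Tokens present in the old vocabulary reuse their trained vectors;
    new tokens are drawn from a small Gaussian (hypothetical scheme --
    the paper's exact initialization may differ).
    """
    new_emb = []
    for tok in new_vocab:
        if tok in old_vocab:
            new_emb.append(old_emb[old_vocab[tok]])  # reuse trained vector
        else:
            new_emb.append([random.gauss(0.0, scale) for _ in range(dim)])
    return new_emb

def upsample_probs(sizes, temperature=5.0):
    """Temperature-based data up-sampling: p_i proportional to (n_i/N)^(1/T).

    A higher temperature flattens the distribution, so low-resource
    (typically new) language pairs are sampled more often.
    """
    total = sum(sizes.values())
    weights = {lang: (n / total) ** (1.0 / temperature)
               for lang, n in sizes.items()}
    z = sum(weights.values())
    return {lang: w / z for lang, w in weights.items()}
```

For example, with one high-resource pair of 1M sentences and one new pair of 1K sentences, a temperature of 5 raises the new pair's sampling probability from about 0.1% to roughly 20%, which is what lets the new directions receive meaningful gradient signal early in continued training.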

Related research

- 11/21/2022 · Towards continually learning new languages
  Multilingual speech recognition with neural networks is often implemente...
- 03/11/2021 · Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution
  We propose a straightforward vocabulary adaptation scheme to extend the ...
- 08/02/2020 · Multilingual Translation with Extensible Multilingual Pretraining and Finetuning
  Recent work demonstrates the potential of multilingual pretraining of cr...
- 11/03/2018 · Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
  We propose a method to transfer knowledge across neural machine translat...
- 12/13/2021 · English2Gbe: A multilingual machine translation model for Fon/EweGbe
  Language is an essential factor of emancipation. Unfortunately, most of ...
- 05/04/2023 · Learning Language-Specific Layers for Multilingual Machine Translation
  Multilingual Machine Translation promises to improve translation quality...
- 09/14/2022 · Parameter-Efficient Finetuning for Robust Continual Multilingual Learning
  NLU systems deployed in the real world are expected to be regularly upda...
