SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

10/20/2022
by Alireza Mohammadshahi, et al.

In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation. To overcome the "curse of multilinguality", these models often opt for scaling up the number of parameters, which makes their use in resource-constrained environments challenging. We introduce SMaLL-100, a distilled version of the M2M-100 (12B) model, a massively multilingual machine translation model covering 100 languages. We train SMaLL-100 with uniform sampling across all language pairs, thereby focusing on preserving the performance of low-resource languages. We evaluate SMaLL-100 on several low-resource benchmarks (FLORES-101, Tatoeba, and TICO-19) and demonstrate that it outperforms previous massively multilingual models of comparable size (200-600M parameters) while improving inference latency and memory usage. Additionally, our model achieves results comparable to M2M-100 (1.2B) while being 3.6x smaller and 4.3x faster at inference. Code and pre-trained models: https://github.com/alirezamshi/small100
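
For readers who want to try the released checkpoint, a minimal Python usage sketch follows. It assumes the model is published on the Hugging Face Hub under the id alirezamshi/small100 and loads with the standard M2M100ForConditionalGeneration class, and that the custom SMALL100Tokenizer (shipped as tokenization_small100.py in the repository linked above) has been downloaded next to the script; the target-language convention shown here is an assumption, so consult the repository README for the authoritative instructions.

    # Sketch: translating with SMaLL-100 via Hugging Face transformers.
    # Assumptions: the checkpoint is hosted as "alirezamshi/small100" and uses
    # the M2M-100 architecture; tokenization_small100.py was downloaded from
    # https://github.com/alirezamshi/small100 into the working directory.
    from transformers import M2M100ForConditionalGeneration
    from tokenization_small100 import SMALL100Tokenizer  # custom tokenizer from the repo

    model = M2M100ForConditionalGeneration.from_pretrained("alirezamshi/small100")
    tokenizer = SMALL100Tokenizer.from_pretrained("alirezamshi/small100")

    # SMaLL-100 is assumed to condition on the target-language code on the
    # source side, so only tgt_lang needs to be set before encoding.
    tokenizer.tgt_lang = "fr"
    inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")

    generated = model.generate(**inputs, max_length=128)
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))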

Related research

05/31/2022
Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Model
Numerous recent works on unsupervised machine translation (UMT) imply t...

06/14/2023
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations
Vision-and-language (VL) models with separate encoders for each modality...

07/11/2022
No Language Left Behind: Scaling Human-Centered Machine Translation
Driven by the goal of eradicating language barriers on a global scale, m...

10/13/2020
The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT
This paper describes the development of a new benchmark for machine tran...

04/15/2021
Demystify Optimization Challenges in Multilingual Transformers
Multilingual Transformer improves parameter efficiency and crosslingual ...

05/22/2023
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Despite advances in multilingual neural machine translation (MNMT), we a...

12/01/2022
A Commonsense-Infused Language-Agnostic Learning Framework for Enhancing Prediction of Political Polarity in Multilingual News Headlines
Predicting the political polarity of news headlines is a challenging tas...
