Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability

by Jiacheng Ye, et al.

Multilingual transfer ability, which reflects how well models fine-tuned on one source language can be applied to other languages, has been well studied in multilingual pre-trained models (e.g., BLOOM). However, this ability has not been investigated for English-centric models (e.g., LLaMA). To fill this gap, we study the following research questions. First, does multilingual transfer ability exist in English-centric models, and how does it compare with that of multilingual pre-trained models? Second, does it appear only when English is the source language for the English-centric model? Third, how does it vary across tasks? We take multilingual reasoning ability as our focus and conduct extensive experiments across four types of reasoning tasks. We find that the multilingual pre-trained model does not always outperform an English-centric model. Furthermore, English appears to be a less suitable source language, and the choice of source language becomes less important as the English-centric model scales up. In addition, different types of tasks exhibit different multilingual transfer abilities. These findings demonstrate that English-centric models not only possess multilingual transfer ability but may even surpass the transferability of multilingual pre-trained models if well trained. By showing these strengths and weaknesses, the experiments also provide valuable insights into enhancing the multilingual reasoning abilities of English-centric models.



An Empirical Investigation of Multi-bridge Multilingual NMT models

In this paper, we present an extensive investigation of multi-bridge, ma...

Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

Despite their success, large pre-trained multilingual models have not co...

Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks

Logical approaches to representing language have developed and evaluated...

Relationship of the language distance to English ability of a country

Language difference is one of the factors that hinder the acquisition of...

Discrete and Soft Prompting for Multilingual Models

It has been shown for English that discrete and soft prompting perform s...

Local Structure Matters Most in Most Languages

Many recent perturbation studies have found unintuitive results on what ...

Understanding Crosslingual Transfer Mechanisms in Probabilistic Topic Modeling

Probabilistic topic modeling is a popular choice as the first step of cr...
