Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM's Translation Capability

05/17/2023
by Eleftheria Briakou, et al.

Large, multilingual language models exhibit surprisingly good zero- or few-shot machine translation capabilities, despite having never seen the intentionally included translation examples provided to typical neural translation systems. We investigate the role of incidental bilingualism – the unintentional consumption of bilingual signals, including translation examples – in explaining the translation capabilities of large language models, taking the Pathways Language Model (PaLM) as a case study. We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. We show that PaLM is exposed to over 30 million translation pairs across at least 44 languages. Furthermore, the amount of incidental bilingual content is highly correlated with the amount of monolingual in-language content for non-English languages. We relate incidental bilingual content to zero-shot prompts and show that it can be used to mine new prompts to improve PaLM's out-of-English zero-shot translation quality. Finally, in a series of small-scale ablations, we show that the presence of incidental bilingual content has a substantial impact on translation capabilities, although this impact diminishes with model scale.


