The unreasonable effectiveness of few-shot learning for machine translation

02/02/2023
by   Xavier Garcia, et al.

We demonstrate the potential of few-shot translation systems, trained with unpaired language data, for both high- and low-resource language pairs. We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. In particular, we outperform the best performing system on the WMT'21 English-Chinese news translation task by using only five examples of English-Chinese parallel data at inference. Moreover, our approach to building these models does not necessitate joint multilingual training or back-translation, is conceptually simple, and shows the potential to extend to the multilingual setting. Furthermore, the resulting models are two orders of magnitude smaller than state-of-the-art language models. We then analyze the factors which impact the performance of few-shot translation systems, and highlight that the quality of the few-shot demonstrations heavily determines the quality of the translations generated by our models. Finally, we show that the few-shot paradigm also provides a way to control certain attributes of the translation: we show that we are able to control for regional varieties and formality using only five examples at inference, paving the way towards controllable machine translation systems.
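To make the few-shot setup concrete, the sketch below assembles a 5-shot translation prompt for a decoder-only language model, in the spirit of the approach described above. The delimiter format, language labels, and demonstration sentences are illustrative assumptions, not the paper's exact recipe or data.

```python
# Illustrative sketch: building a 5-shot English-Chinese translation prompt.
# A decoder-only model would be asked to continue this prompt, producing
# the translation of the final source sentence.

def build_few_shot_prompt(examples, source_sentence,
                          src_lang="English", tgt_lang="Chinese"):
    """Concatenate demonstration pairs, then append the new source
    sentence with an empty target side for the model to complete."""
    blocks = [
        f"{src_lang}: {src}\n{tgt_lang}: {tgt}"
        for src, tgt in examples
    ]
    blocks.append(f"{src_lang}: {source_sentence}\n{tgt_lang}:")
    return "\n\n".join(blocks)

# Hypothetical high-quality demonstrations (placeholders, not real data).
demos = [
    ("Good morning.", "早上好。"),
    ("Thank you very much.", "非常感谢。"),
    ("Where is the train station?", "火车站在哪里？"),
    ("The weather is nice today.", "今天天气很好。"),
    ("I would like a cup of tea.", "我想要一杯茶。"),
]

prompt = build_few_shot_prompt(demos, "See you tomorrow.")
print(prompt)
```

Since the paper finds that demonstration quality heavily determines output quality, the choice of the five example pairs here matters more than the prompt formatting itself.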

