A cost-benefit analysis of cross-lingual transfer methods

05/14/2021
by Guilherme Moraes Rosa, et al.

An effective method for cross-lingual transfer is to fine-tune a bilingual or multilingual model on a supervised dataset in one language and evaluate it on another language in a zero-shot manner. Translating examples at training time or at inference time is also a viable alternative. However, these methods carry costs that are rarely addressed in the literature. In this work, we analyze cross-lingual methods in terms of their effectiveness (e.g., accuracy), their development and deployment costs, and their latencies at inference time. Our experiments on three tasks indicate that the best cross-lingual method is highly task-dependent. Finally, by combining zero-shot and translation methods, we achieve state-of-the-art results on two of the three datasets used in this work. Based on these results, we question the need for manually labeled training data in a target language. Code, models, and translated datasets are available at https://github.com/unicamp-dl/cross-lingual-analysis
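The three strategies named in the abstract differ mainly in where the translation cost is paid. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`zero_shot`, `translate_train`, `translate_test`), the toy keyword "model", and the two-word English-to-Portuguese lexicon are all hypothetical stand-ins, and the call counter is only a rough proxy for development versus inference cost.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]                    # (text, label)
Translator = Callable[[str], str]            # hypothetical MT system
Model = Callable[[str], int]                 # maps text to a predicted label
Trainer = Callable[[List[Example]], Model]   # hypothetical fine-tuning routine


def zero_shot(train_src: List[Example], fit_multilingual: Trainer) -> Model:
    """Train on source-language data; apply the model directly to target text.

    No translation is involved; transfer relies entirely on the (assumed)
    multilingual model behind `fit_multilingual`.
    """
    return fit_multilingual(train_src)


def translate_train(train_src: List[Example], to_target: Translator,
                    fit_target: Trainer) -> Model:
    """Translate the training set once, then train on the translated examples."""
    translated = [(to_target(text), label) for text, label in train_src]
    return fit_target(translated)


def translate_test(train_src: List[Example], to_source: Translator,
                   fit_source: Trainer) -> Model:
    """Train in the source language; translate every input at inference time."""
    model = fit_source(train_src)
    return lambda text: model(to_source(text))


# --- Toy stand-ins (assumptions for illustration only) ---

def make_counting_translator(lexicon: Dict[str, str]):
    """Word-by-word dictionary 'translator' that counts how often it is called."""
    calls = {"n": 0}

    def translate(text: str) -> str:
        calls["n"] += 1
        return " ".join(lexicon.get(word, word) for word in text.split())

    return translate, calls


def fit_keyword_model(examples: List[Example]) -> Model:
    """Memorize word -> label; predict the label of the first known word, else 0."""
    word_label: Dict[str, int] = {}
    for text, label in examples:
        for word in text.split():
            word_label[word] = label

    def predict(text: str) -> int:
        for word in text.split():
            if word in word_label:
                return word_label[word]
        return 0

    return predict


if __name__ == "__main__":
    train_en = [("good", 1), ("bad", 0)]
    en_pt = {"good": "bom", "bad": "ruim"}
    pt_en = {pt: en for en, pt in en_pt.items()}

    to_pt, train_time_calls = make_counting_translator(en_pt)
    to_en, test_time_calls = make_counting_translator(pt_en)

    model_a = translate_train(train_en, to_pt, fit_keyword_model)
    model_b = translate_test(train_en, to_en, fit_keyword_model)

    for query in ["bom", "ruim", "bom"]:
        model_a(query)
        model_b(query)

    # Translate-train pays once per training example; translate-test pays per query.
    print("translate-train translator calls:", train_time_calls["n"])
    print("translate-test translator calls:", test_time_calls["n"])
```

In this toy setup, translate-train calls the translator only while building the training set, whereas translate-test calls it on every query, which is the kind of development-versus-latency trade-off the abstract refers to. The zero-shot variant avoids translation altogether but depends on a genuinely multilingual model, which the keyword stand-in here cannot imitate.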


