Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings

10/23/2022
by Iker García-Ferrero, et al.

Zero-resource cross-lingual transfer approaches aim to apply supervised models from a source language to unlabelled target languages. In this paper we perform an in-depth study of the two main techniques employed so far for cross-lingual zero-resource sequence labelling, based either on data or model transfer. Although previous research has proposed translation and annotation projection (data-based cross-lingual transfer) as an effective technique for cross-lingual sequence labelling, we experimentally demonstrate that high-capacity multilingual language models applied in a zero-shot setting (model-based cross-lingual transfer) consistently outperform data-based cross-lingual transfer approaches. A detailed analysis of our results suggests that this might be due to important differences in language use. More specifically, machine translation often generates a textual signal that differs from what the models are exposed to when using gold-standard data, which affects both the fine-tuning and evaluation processes. Our results also indicate that data-based cross-lingual transfer approaches remain a competitive option when high-capacity multilingual language models are not available.

research
04/05/2022

Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval

State-of-the-art neural (re)rankers are notoriously data hungry which - ...
research
10/23/2020

Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond

Cross-lingual adaptation with multilingual pre-trained language models (...
research
05/26/2023

Towards a Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review

In recent years, pre-trained Multilingual Language Models (MLLMs) have s...
research
04/17/2021

Multilingual and Cross-Lingual Intent Detection from Spoken Data

We present a systematic study on multilingual and cross-lingual intent d...
research
09/12/2017

Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

This paper presents our segmentation system developed for the MLP 2017 s...
research
12/04/2022

Cross-lingual Similarity of Multilingual Representations Revisited

Related works used indexes like CKA and variants of CCA to measure the s...
research
01/26/2021

Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks

In zero-shot cross-lingual transfer, a supervised NLP task trained on a ...
