The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus

05/16/2021
by Samia Touileb, et al.

Recent years have seen growing interest in cross-lingual transfer between typologically similar languages and between languages written in different scripts. However, the interplay between language similarity and script difference in cross-lingual transfer remains less studied. We explore this interplay for two supervised tasks: part-of-speech tagging and sentiment analysis. We introduce a newly annotated corpus of Algerian user-generated comments comprising parallel annotations of Algerian written in Latin, Arabic, and code-switched scripts, as well as annotations for sentiment and topic categories. We perform baseline experiments by fine-tuning multi-lingual language models. We further explore the effect of script vs. language similarity in cross-lingual transfer by fine-tuning multi-lingual models on languages that are a) typologically distinct but use the same script, b) typologically similar but use a distinct script, or c) typologically similar and use the same script. We find a delicate relationship between script and typology for part-of-speech tagging, while sentiment analysis is less sensitive.

