Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task

03/01/2021
by Badr AlKhamissi, et al.

In this paper, we tackle the Nuanced Arabic Dialect Identification (NADI) shared task (Abdul-Mageed et al., 2021) and demonstrate state-of-the-art results on all four of its subtasks. The subtasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels. Our final model is an ensemble of variants built on top of MARBERT that achieves an F1-score of 34.03 on the country-level development set, an improvement of 7.63 over previous work.


