
Morphosyntactic Tagging with Pre-trained Language Models for Arabic and its Dialects

10/13/2021
by Go Inoue, et al.

We present state-of-the-art results on morphosyntactic tagging across different varieties of Arabic using fine-tuned pre-trained transformer language models. Our models consistently outperform existing systems in Modern Standard Arabic and all the Arabic dialects we study, achieving a 2.6% absolute improvement over the previous state-of-the-art in Modern Standard Arabic, 2.8% in Gulf, 1.6% in Egyptian, and 8.3% in Levantine. We explore different setups for fine-tuning pre-trained transformer language models, including training data size, the use of external linguistic resources, and the use of annotated data from other dialects in a low-resource scenario. Our results show that strategic fine-tuning using datasets from other high-resource dialects is beneficial for a low-resource dialect. Additionally, we show that high-quality morphological analyzers as external linguistic resources are especially beneficial in low-resource settings.
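As a concrete illustration of the setup the abstract describes, here is a minimal sketch of fine-tuning a pre-trained transformer for morphosyntactic tagging framed as token classification, using Hugging Face Transformers. The checkpoint name, toy tag set, and example sentence are illustrative assumptions, not the paper's exact models or data.

```python
# Minimal sketch: fine-tune a pre-trained transformer for morphosyntactic
# tagging as token classification. Checkpoint, tag set, and example are
# illustrative assumptions, not the paper's exact setup.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "CAMeL-Lab/bert-base-arabic-camelbert-msa"  # assumed checkpoint
LABELS = ["noun", "verb", "adj", "prep", "punc"]  # toy tag set, not the paper's

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

# One toy example: whitespace-tokenized words with one tag index per word.
words = ["هذا", "كتاب", "جديد"]
word_tags = [0, 0, 2]  # indices into LABELS

# Align word-level tags to subword tokens: label the first subword of each
# word and mark continuation pieces and special tokens with -100 so the
# cross-entropy loss ignores them.
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
labels, prev_word = [], None
for word_id in enc.word_ids(batch_index=0):
    if word_id is None or word_id == prev_word:
        labels.append(-100)
    else:
        labels.append(word_tags[word_id])
    prev_word = word_id

# One training step on the single example.
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
optimizer.zero_grad()
out = model(**enc, labels=torch.tensor([labels]))
out.loss.backward()
optimizer.step()
print(f"loss: {out.loss.item():.4f}")
```

In practice one would iterate over full treebank splits per dialect; the snippet only shows the per-batch mechanics of mapping word-level tags onto subword tokens, which is the core of casting morphosyntactic tagging as token classification.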


Related research

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models (03/11/2021)
In this paper, we explore the effects of language variants, data sizes, ...

Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning (12/04/2020)
Recently, leveraging pre-trained Transformer based language models in do...

Zero-Resource Multi-Dialectal Arabic Natural Language Understanding (04/14/2021)
A reasonable amount of annotated data is required for fine-tuning pre-tr...

Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling (01/12/2021)
A sufficient amount of annotated data is usually required to fine-tune p...

Quantifying Language Variation Acoustically with Few Resources (05/05/2022)
Deep acoustic models represent linguistic information based on massive a...

Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph (05/22/2023)
There have been many recent investigations into prompt-based training of...