Automatic Difficulty Classification of Arabic Sentences

03/07/2021
by   Nouran Khallaf, et al.

In this paper, we present a Modern Standard Arabic (MSA) sentence difficulty classifier, which predicts the difficulty of sentences for language learners using either the CEFR proficiency levels or a binary classification as simple or complex. We compare the use of sentence embeddings of different kinds (fastText, mBERT, XLM-R and Arabic-BERT), as well as traditional language features such as POS tags, dependency trees, readability scores and frequency lists for language learners. Our best results have been achieved using fine-tuned Arabic-BERT. Our 3-way CEFR classification reaches an F-1 of 0.80 with Arabic-BERT and 0.75 with XLM-R, and a Spearman correlation of 0.71 for regression. Our binary difficulty classifier reaches an F-1 of 0.94, and our sentence-pair semantic similarity classifier reaches an F-1 of 0.98.
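
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F-1 score and a Spearman correlation can be computed. It is not the authors' evaluation code. The abstract does not say how F-1 is averaged across classes, so macro-averaging is an assumption here, and the function names (`macro_f1`, `average_ranks`, `spearman`) and the three-label toy data are purely hypothetical.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F-1 scores (macro-averaging is assumed)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F-1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: three difficulty labels, not the paper's dataset.
gold = ["A", "A", "B", "B", "C", "C"]
pred = ["A", "B", "B", "B", "C", "A"]
print(macro_f1(gold, pred))

# Hypothetical numeric difficulty scores for the regression setting.
print(spearman([1, 2, 3, 4], [1.2, 1.9, 3.5, 3.1]))
```

Spearman correlation only compares orderings, which suits a regression setting where the relative ranking of sentences by difficulty matters more than the exact predicted value.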


