Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation

05/24/2023
by Bashar Alhafni, et al.

Grammatical error correction (GEC) is a well-explored problem in English, with many existing models and datasets. However, research on GEC in morphologically rich languages has been limited by challenges such as data scarcity and language complexity. In this paper, we present the first results on Arabic GEC using two newly developed Transformer-based pretrained sequence-to-sequence models. We also address multi-class Arabic grammatical error detection (GED) and present the first results on that task. We show that using GED information as auxiliary input to GEC models improves GEC performance across three datasets spanning different genres. We further investigate the use of contextual morphological preprocessing to aid GEC systems. Our models achieve state-of-the-art results on two Arabic GEC shared task datasets and establish a strong benchmark on a newly created dataset.


