Automatic Error Type Annotation for Arabic

by Riadh Belkebir, et al.

We present ARETA, an automatic error type annotation system for Modern Standard Arabic. We design ARETA to address Arabic's morphological richness and orthographic ambiguity. We base our error taxonomy on the Arabic Learner Corpus (ALC) Error Tagset with some modifications. ARETA achieves a performance of 85.8% (micro average F1 score) on a manually annotated blind test portion of ALC. We also demonstrate ARETA's usability by applying it to a number of submissions from the QALB 2014 shared task for Arabic grammatical error correction. The resulting analyses give helpful insights into the strengths and weaknesses of different submissions, which is more useful than the opaque M2 scoring metrics used in the shared task. ARETA employs a large Arabic morphological analyzer, but is otherwise completely unsupervised. We make ARETA publicly available.




