Towards Arabic Sentence Simplification via Classification and Generative Approaches

04/20/2022
by   Nouran Khallaf, et al.
0

This paper presents an attempt to build a Modern Standard Arabic (MSA) sentence-level simplification system. We experimented with sentence simplification using two approaches: (i) a classification approach leading to lexical simplification pipelines which use Arabic-BERT, a pre-trained contextualised model, as well as a model of fastText word embeddings; and (ii) a generative approach, a Seq2Seq technique by applying a multilingual Text-to-Text Transfer Transformer mT5. We developed our training corpus by aligning the original and simplified sentences from the internationally acclaimed Arabic novel "Saaq al-Bambuu". We evaluate effectiveness of these methods by comparing the generated simple sentences to the target simple sentences using the BERTScore evaluation metric. The simple sentences produced by the mT5 model achieve P 0.72, R 0.68 and F-1 0.70 via BERTScore, while, combining Arabic-BERT and fastText achieves P 0.97, R 0.97 and F-1 0.97. In addition, we report a manual error analysis for these experiments. <https://github.com/Nouran-Khallaf/Lexical_Simplification>

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/07/2021

Automatic Difficulty Classification of Arabic Sentences

In this paper, we present a Modern Standard Arabic (MSA) Sentence diffic...
research
01/22/2021

BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets

During the last two decades, we have progressively turned to the Interne...
research
04/14/2023

SimpLex: a lexical text simplification architecture

Text simplification (TS) is the process of generating easy-to-understand...
research
07/31/2018

An Enhanced Latent Semantic Analysis Approach for Arabic Document Summarization

The fast-growing amount of information on the Internet makes the researc...
research
05/12/2021

UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

We propose a cascade of neural models that performs sentence classificat...
research
10/20/2020

Text Classification of COVID-19 Press Briefings using BERT and Convolutional Neural Networks

We build a sentence-level political discourse classifier using existing ...
research
07/12/2023

Ashaar: Automatic Analysis and Generation of Arabic Poetry Using Deep Learning Approaches

Poetry holds immense significance within the cultural and traditional fa...

Please sign up or login with your details

Forgot password? Click here to reset