Rational Kernels for Arabic Stemming and Text Classification

by Attia Nehar, et al.
Université de Rouen
Universite Amar Telidji Laghouat

In this paper, we address Arabic text classification and stemming using transducers and rational kernels. We introduce a new stemming technique based on Arabic patterns (the Pattern Based Stemmer). Patterns are modelled as transducers, and stemming is performed without relying on any dictionary. Because stemming is done with transducers, documents are themselves transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic text classification. Stemming experiments are conducted on three word collections, and classification experiments are carried out on the Saudi Press Agency dataset. Results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy, recall and F1.
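
To make the idea of dictionary-free, pattern-based stemming concrete, here is a minimal Python sketch. It is not the paper's transducer implementation: it uses plain string matching, and the names `ROOT_SLOTS`, `PATTERNS`, `match_pattern` and `extract_root`, as well as the four patterns listed, are assumptions chosen for illustration. Each pattern uses the conventional placeholder letters ف, ع and ل for the three root consonants; every other letter in the pattern must appear literally in the word.

```python
from typing import Optional

# Placeholder letters that stand for the three root consonants in a pattern.
# (Conventional in Arabic morphology; treating them as slots is this sketch's assumption.)
ROOT_SLOTS = "فعل"

# A tiny illustrative pattern list, NOT the pattern set used in the paper.
PATTERNS = ["فاعل", "مفعول", "فعال", "فعل"]


def match_pattern(word: str, pattern: str) -> Optional[str]:
    """Return the captured root letters if `word` fits `pattern`, else None."""
    if len(word) != len(pattern):
        return None
    root = []
    for w_ch, p_ch in zip(word, pattern):
        if p_ch in ROOT_SLOTS:
            root.append(w_ch)   # slot position: capture the word's letter
        elif w_ch != p_ch:
            return None         # literal pattern letter must match exactly
    return "".join(root)


def extract_root(word: str) -> Optional[str]:
    """Try each pattern in order and return the first root found, if any."""
    for pattern in PATTERNS:
        root = match_pattern(word, pattern)
        if root is not None:
            return root
    return None
```

Under these assumed patterns, `extract_root("كاتب")` and `extract_root("مكتوب")` both yield the root كتب, with no dictionary lookup involved. A transducer-based version would encode each pattern as a finite state machine that reads the word and emits the root letters, which is what makes the resulting document representation usable with rational kernels.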



