An Enhanced Corpus for Arabic Newspapers Comments

02/08/2021
by Hichem Rahab, et al.

In this paper, we propose an enhanced approach to creating a dedicated corpus of comments from Algerian Arabic newspapers. The approach improves on an existing one by enriching the available corpus and by adding an annotation step that follows the Model, Annotate, Train, Test, Evaluate, Revise (MATTER) methodology. The corpus is built by collecting comments from the websites of three well-known Algerian newspapers. Three classifiers, support vector machines, naïve Bayes, and k-nearest neighbors, were used to classify comments into positive and negative classes. To measure the influence of stemming on the results, classification was tested with and without stemming. The results show that stemming does not considerably improve classification, owing to the nature of Algerian comments, which are tied to the Algerian Arabic dialect. These promising results motivate us to improve the approach, particularly in handling non-Arabic sentences, especially dialectal and French ones.
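The with-and-without-stemming comparison described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: it implements only a multinomial naïve Bayes classifier with Laplace smoothing in plain Python (support vector machines and k-nearest neighbors are left out), and the affix lists in `light_stem`, the four training comments, and the labels `pos`/`neg` are all hypothetical toy assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical affix lists for illustration only; not the stemmer used in the paper.
PREFIXES = ("ال", "و")
SUFFIXES = ("ات", "ون", "ين", "ة")


def light_stem(token):
    """Strip at most one listed prefix and one listed suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    return token


def tokenize(text, stem=False):
    """Split on whitespace and optionally apply the toy light stemmer."""
    tokens = text.split()
    return [light_stem(t) for t in tokens] if stem else tokens


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, stem=False):
        self.stem = stem
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text, self.stem)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text, self.stem) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total_docs)
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (made up for this sketch, not taken from the corpus).
TRAIN_TEXTS = ["خبر جيد", "مقال ممتاز", "خبر سيء", "مقال رديء"]
TRAIN_LABELS = ["pos", "pos", "neg", "neg"]

plain = NaiveBayes(stem=False).fit(TRAIN_TEXTS, TRAIN_LABELS)
stemmed = NaiveBayes(stem=True).fit(TRAIN_TEXTS, TRAIN_LABELS)

comment = "الخبر ممتاز"
print("without stemming:", plain.predict(comment))
print("with stemming:", stemmed.predict(comment))
```

In this toy case both variants reach the same label, because the sentiment-bearing word is unaffected by affix stripping; stemming only changes whether the noun with the definite article is matched against the vocabulary. A real comparison would use the annotated corpus and held-out evaluation rather than a single hand-picked comment.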

