A Speech Act Classifier for Persian Texts and its Application in Identifying Speech Acts of Rumors

Speech Acts (SAs) are an important area of pragmatics: they convey the intended language function of an utterance and give us a better understanding of the speaker's state of mind. Knowing the SA of a text can help in analyzing that text in natural language processing applications. This study presents a dictionary-based statistical technique for Persian SA recognition. The proposed technique classifies a text into seven SA classes based on four kinds of features: lexical, syntactic, semantic, and surface. WordNet is used to extract synonyms and enrich the feature dictionary. To evaluate the proposed technique, we used four classification methods: Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN). The experimental results demonstrate that the proposed method, with RF and SVM as the best classifiers, achieved state-of-the-art performance with an accuracy of 0.95 for classification of Persian SAs. Our original motivation for this work was to introduce an application of SA recognition to social media content, especially identifying the common SAs in rumors. Therefore, the proposed system was used to determine the common SAs in rumors. The results showed that Persian rumors are often expressed in three SA classes (narrative, question, and threat) and in some cases with the request SA.
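To make the idea of dictionary-based features concrete, here is a minimal Python sketch of how a cue-word dictionary might be enriched with synonyms and turned into lexical and surface features. It is not the authors' implementation: the cue words are English stand-ins, the `SYNONYMS` table is a hypothetical placeholder for a WordNet lookup, the function names `enrich` and `extract_features` are invented for illustration, and only three of the SA labels named in the abstract are used.

```python
from collections import Counter

# Hypothetical cue-word dictionary (English stand-ins, NOT the paper's Persian lexicon).
CUE_WORDS = {
    "question": {"why", "who", "what"},
    "request": {"please", "kindly"},
    "threat": {"beware", "warning"},
}

# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {
    "please": {"pray"},
    "warning": {"caution", "alert"},
}


def enrich(cues, synonyms):
    """Return a copy of the cue dictionary with each cue word's synonyms added."""
    enriched = {}
    for label, words in cues.items():
        expanded = set(words)
        for word in words:
            expanded |= synonyms.get(word, set())
        enriched[label] = expanded
    return enriched


def extract_features(text, cues):
    """Build a small feature dict: one cue count per label plus two surface features."""
    # Strip common Latin and Persian punctuation from token edges.
    tokens = [t.strip(".,!?;:؟،").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    counts = Counter(tokens)
    # Lexical features: how many tokens match each label's cue words.
    features = {
        f"cue_{label}": sum(counts[w] for w in words)
        for label, words in sorted(cues.items())
    }
    # Surface features: final question mark (Latin or Persian) and text length.
    features["ends_with_question_mark"] = int(text.rstrip().endswith(("?", "؟")))
    features["num_tokens"] = len(tokens)
    return features


ENRICHED = enrich(CUE_WORDS, SYNONYMS)
```

Calling `extract_features(text, ENRICHED)` yields a fixed set of numeric features per text; vectors like these could then be passed to any of the four classifiers mentioned above. The syntactic and semantic features described in the abstract are not covered by this sketch.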

research
08/08/2023

A Comparative Study on TF-IDF feature Weighting Method and its Analysis using Unstructured Dataset

Text Classification is the process of categorizing text into the relevan...
research
05/17/2016

Tweet Acts: A Speech Act Classifier for Twitter

Speech acts are a way to conceptualize speech as action. This holds true...
research
11/14/2018

A Study of Language and Classifier-independent Feature Analysis for Vocal Emotion Recognition

Every speech signal carries implicit information about the emotions, whi...
research
08/28/2020

An Intelligent CNN-VAE Text Representation Technology Based on Text Semantics for Comprehensive Big Data

In the era of big data, a large number of text data generated by the Int...
research
11/30/2020

Procode: the Swiss Multilingual Solution for Automatic Coding and Recoding of Occupations and Economic Activities

Objective. Epidemiological studies require data that are in alignment wi...
research
02/05/2023

Machine Learning Methods for Evaluating Public Crisis: Meta-Analysis

This study examines machine learning methods used in crisis management. ...
research
10/24/2018

A Text Classification Application: Poet Detection from Poetry

With the widespread use of the internet, the size of the text data incre...