Misinformation detection in Luganda-English code-mixed social media text

03/31/2021
by Peter Nabende, et al.

The increasing occurrence, forms, and negative effects of misinformation on social media platforms have created a need for more misinformation detection tools. Work is currently under way to address COVID-19 misinformation; however, there are no misinformation detection tools for any of the 40 distinct indigenous Ugandan languages. This paper addresses this gap by presenting basic language resources and a misinformation detection data set based on code-mixed Luganda-English messages sourced from Facebook and Twitter. Several machine learning methods are applied to the data set to develop classification models that detect whether a code-mixed Luganda-English message contains misinformation. A 10-fold cross-validation evaluation of the classification methods in an experimental misinformation detection task shows that a Discriminative Multinomial Naive Bayes (DMNB) method achieves the highest accuracy and F-measure, 78.19% and 77.90% respectively. Other classification models achieve comparable results. These results are promising, since the machine learning models are based on n-gram features from only the misinformation detection data set.
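To make the evaluation setup concrete, here is a minimal sketch of n-gram features, a Naive Bayes classifier, and k-fold cross-validation in plain Python. It rests on several assumptions that are not from the paper: it uses a standard generative multinomial Naive Bayes with Laplace smoothing rather than DMNB (which learns its parameters discriminatively), it uses word unigrams and bigrams with whitespace tokenization, the folds are unstratified, and the two toy English messages and the labels `misinfo` and `ok` are hypothetical placeholders rather than items from the Luganda-English data set.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Word n-grams (n = 1..n_max) from a lowercased, whitespace-split message."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class MultinomialNB:
    """Generative multinomial Naive Bayes (an assumption; not the paper's DMNB)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = ngrams(doc)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feat_counts[label].values()) + self.alpha * len(self.vocab)
            for feat in ngrams(doc):
                if feat in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.feat_counts[label][feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(docs, labels, k=10):
    """Unstratified k-fold cross-validation; returns mean accuracy over folds."""
    n = len(docs)
    accuracies = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        if not test_idx or not train_idx:
            continue
        model = MultinomialNB().fit(
            [docs[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct = sum(model.predict(docs[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy messages (not from the paper's data set), repeated to fill 10 folds.
TOY_DOCS = ["drink hot tea to cure the virus", "wash hands and wear a mask"] * 10
TOY_LABELS = ["misinfo", "ok"] * 10
toy_accuracy = cross_validate(TOY_DOCS, TOY_LABELS, k=10)
```

Because the toy messages are exact repeats, `toy_accuracy` says nothing about real performance; the sketch only shows how the pieces fit together. A real experiment would load the labeled code-mixed messages, and would likely stratify the folds and report F-measure alongside accuracy.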

