
Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter

by Sarah Alqurashi, et al.

The rapid growth of social media content during the current pandemic has provided useful tools for disseminating information, but it has also become a source of misinformation. There is therefore an urgent need for fact-checking and for effective techniques to detect misinformation on social media. In this work, we study misinformation in the Arabic content of Twitter. We construct a large Arabic dataset related to COVID-19 misinformation and gold-annotate the tweets into two categories: misinformation or not. We then apply eight traditional and deep machine learning models with different features, including word embeddings and word frequency. The word embedding models (FastText and word2vec) draw on more than two million Arabic tweets related to COVID-19. Experiments show that optimizing the area under the curve (AUC) improves the models' performance and that Extreme Gradient Boosting (XGBoost) achieves the highest accuracy in detecting COVID-19 misinformation online.
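The abstract names AUC as the quantity being optimized but does not say how it is computed. As a minimal sketch of the metric only (not the authors' implementation), AUC for a binary classifier can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `roc_auc` and the toy labels and scores below are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = misinformation, 0 = not misinformation.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large datasets.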


Large Arabic Twitter Dataset on COVID-19

The 2019 coronavirus disease (COVID-19), emerged late December 2019 in C...

Detecting Potentially Harmful and Protective Suicide-related Content on Twitter: A Machine Learning Approach

Research shows that exposure to suicide-related news media content is as...

How COVID-19 Is Changing Our Language : Detecting Semantic Shift in Twitter Word Embeddings

Words are malleable objects, influenced by events that are reflected in ...

Detecting weak and strong Islamophobic hate speech on social media

Islamophobic hate speech on social media inflicts considerable harm on b...

Using Arabic Tweets to Understand Drug Selling Behaviors

Twitter is a popular platform for e-commerce in the Arab region includin...

BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets

During the last two decades, we have progressively turned to the Interne...