Machine Generation and Detection of Arabic Manipulated and Fake News

11/05/2020
by El Moatez Billah Nagoudi, et al.

Fake news and deceptive machine-generated text are serious problems threatening modern societies, including in the Arab world. This motivates work on detecting false and manipulated stories online. However, a bottleneck for this research is the lack of sufficient data to train detection models. We present a novel method for automatically generating Arabic manipulated (and potentially fake) news stories. Our method is simple and depends only on the availability of true stories, which are abundant online, and a part-of-speech (POS) tagger. To facilitate future work, we remove the need to obtain either of these by providing AraNews, a novel and large POS-tagged news dataset that can be used off the shelf. Using stories generated from AraNews, we carry out a human annotation study that sheds light on the effects of machine manipulation on text veracity. The study also measures the human ability to detect Arabic machine-manipulated text generated by our method. Finally, we develop the first models for detecting manipulated Arabic news and achieve state-of-the-art results on Arabic fake news detection (macro F1 = 70.06). Our models and data are publicly available.
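
For readers unfamiliar with the metric reported above, macro F1 is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how many examples it has. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `macro_f1` and the labels `"true"` and `"fake"` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Return the unweighted mean of per-class F1 scores (range 0.0 to 1.0)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("at least one example is required")

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # missed an instance of class g

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for four stories (illustrative only, not from AraNews).
gold = ["true", "true", "fake", "fake"]
pred = ["true", "fake", "fake", "fake"]
score = macro_f1(gold, pred)
```

The function returns a value between 0 and 1. The abstract's 70.06 appears to be the same quantity expressed as a percentage, which is an assumption based on common reporting practice.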


