Multi-modal Identification of State-Sponsored Propaganda on Social Media

12/24/2020
by Xiaobo Guo, et al.

The prevalence of state-sponsored propaganda on the Internet has become a cause for concern in recent years. While much effort has been made to identify state-sponsored Internet propaganda, the problem remains far from solved, for two reasons: the ambiguous definition of propaganda leads to unreliable data labelling, and the huge number of potential predictive features makes the models hard to explain. This paper is the first attempt to build a balanced dataset for this task. The dataset comprises propaganda by three different organizations across two time periods. We propose a multi-modal framework that detects propaganda messages based solely on their visual and textual content. It achieves promising performance on detecting propaganda by the three organizations, both within the same time period (training and testing on data from the same time period, F1=0.869) and across time periods (training on past data, testing on future data, F1=0.697). To reduce the influence of false positive predictions, we vary the decision threshold to examine the relationship between the false positive and true positive rates. To enhance the interpretability of our framework, we also use visualization tools to explain the predictions made by our models. Our new dataset and general framework provide a strong benchmark for the task of identifying state-sponsored Internet propaganda and point out a potential path for future work on this task.
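
The threshold analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`rates_at_threshold`, `sweep`), the toy scores, and the toy labels are all assumptions made for illustration. The sketch assumes a classifier that outputs a propaganda score in [0, 1] and shows how raising the decision threshold trades true positives for fewer false positives.

```python
from typing import List, Sequence, Tuple


def rates_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) at one threshold.

    A message is predicted as propaganda when its score >= threshold.
    Labels are 1 for propaganda and 0 for non-propaganda.
    """
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def sweep(
    scores: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, TPR, FPR) for each candidate threshold."""
    return [(t, *rates_at_threshold(scores, labels, t)) for t in thresholds]


# Hypothetical model scores and ground-truth labels (illustrative only).
toy_scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]

for threshold, tpr, fpr in sweep(toy_scores, toy_labels, [0.5, 0.7, 0.9]):
    print(f"threshold={threshold:.1f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

On this toy data, moving the threshold from 0.5 to 0.7 removes the single false positive without losing any true positives, while moving it to 0.9 costs a true positive. The same kind of trade-off is what a threshold sweep on a real model is meant to expose.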


