MMCoVaR: Multimodal COVID-19 Vaccine Focused Data Repository for Fake News Detection and a Baseline Architecture for Classification

09/14/2021
by   Mingxuan Chen, et al.

The outbreak of COVID-19 has resulted in an "infodemic" that has encouraged the propagation of misinformation about COVID-19 and cure methods, which in turn could negatively affect the adoption of recommended public health measures in the larger population. In this paper, we provide a new multimodal (consisting of images, text and temporal information) labeled dataset containing news articles and tweets on the COVID-19 vaccine. We collected 2,593 news articles from 80 publishers for one year, between Feb 16th 2020 and May 8th 2021, and 24,184 Twitter posts (collected between April 17th 2021 and May 8th 2021). We combine ratings from two news media ranking sites, Media Bias Chart and Media Bias/Fact Check (MBFC), to classify the news dataset into two levels of credibility: reliable and unreliable. Combining the two filters allows for higher labeling precision. We also propose a stance detection mechanism to annotate tweets into three levels of credibility: reliable, unreliable and inconclusive. We provide several statistics as well as other analytics, such as publisher distribution, publication date distribution and topic analysis. We also provide a novel architecture that classifies the news data as misinformation or truth, to establish a baseline performance for this dataset. We find that the proposed architecture has an F-Score of 0.919 and an accuracy of 0.882 for fake news detection. Furthermore, we provide benchmark performance for misinformation detection on the tweet dataset. This new multimodal dataset can be used in research on the COVID-19 vaccine, including misinformation detection and the influence of fake COVID-19 vaccine information.
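
The abstract does not say how the two sets of ratings are merged. As a minimal sketch, assuming an agreement rule (a publisher is labeled only when both rating sources give the same credibility level, and is otherwise left unlabeled), the two-filter labeling could look like the following. The publisher names, rating values and function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional

# Hypothetical per-publisher ratings from two rating sources.
# These names and values are illustrative only, not data from the paper.
RATINGS_A: Dict[str, str] = {
    "publisher_a": "reliable",
    "publisher_b": "unreliable",
    "publisher_c": "reliable",
}
RATINGS_B: Dict[str, str] = {
    "publisher_a": "reliable",
    "publisher_b": "unreliable",
    "publisher_c": "unreliable",
}


def combine_labels(rating_a: Optional[str], rating_b: Optional[str]) -> Optional[str]:
    """Return a label only when both sources agree; otherwise return None.

    Assumption: agreement between the two filters is what raises precision.
    """
    if rating_a is None or rating_b is None:
        return None
    return rating_a if rating_a == rating_b else None


def label_publishers(ratings_a: Dict[str, str], ratings_b: Dict[str, str]) -> Dict[str, str]:
    """Label every publisher rated by both sources, dropping disagreements."""
    labels: Dict[str, str] = {}
    for publisher in ratings_a.keys() & ratings_b.keys():
        label = combine_labels(ratings_a[publisher], ratings_b[publisher])
        if label is not None:
            labels[publisher] = label
    return labels


LABELS = label_publishers(RATINGS_A, RATINGS_B)
```

Under this assumed rule, a publisher on which the two sources disagree (here `publisher_c`) is excluded rather than given a possibly wrong label, trading coverage for precision.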


research
06/09/2020

ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research

First identified in Wuhan, China, in December 2019, the outbreak of COVI...
research
02/17/2021

Cross-SEAN: A Cross-Stitch Semi-Supervised Neural Attention Model for COVID-19 Fake News Detection

As the COVID-19 pandemic sweeps across the world, it has been accompanie...
research
09/04/2021

Supervised Contrastive Learning for Multimodal Unreliable News Detection in COVID-19 Pandemic

As the digital news industry becomes the main channel of information dis...
research
07/01/2021

Tackling COVID-19 Infodemic using Deep Learning

Humanity is battling one of the most deleterious viruses in modern history...
research
09/13/2022

CovidMis20: COVID-19 Misinformation Detection System on Twitter Tweets using Deep Learning Models

Online news and information sources are convenient and accessible ways t...
research
08/30/2020

QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions

This paper describes the participation of the QMUL-SDS team for Task 1 o...
research
12/23/2020

Fake News Data Collection and Classification: Iterative Query Selection for Opaque Search Engines with Pseudo Relevance Feedback

Retrieving information from an online search engine is the first and mos...
