Identifying Misinformation from Website Screenshots

02/15/2021
by Sara Abdali, et al.

Can the look and feel of a website give information about the trustworthiness of an article? In this paper, we propose to use a promising yet neglected aspect in detecting misinformation: the overall look of the domain webpage. To capture this overall look, we take screenshots of news articles served by either misinformative or trustworthy web domains and leverage a tensor decomposition based semi-supervised classification technique. The proposed approach, VizFake, is insensitive to a number of image transformations, such as converting the image to grayscale, vectorizing the image, and losing some parts of the screenshots. VizFake leverages a very small amount of known labels, mirroring realistic and practical scenarios where labels (especially for known misinformative articles) are scarce and quickly become dated. The F1 score of VizFake on a dataset of 50k screenshots of news articles spanning more than 500 domains is roughly 85% using only 5% of ground truth labels. Furthermore, tensor representations of VizFake, obtained in an unsupervised manner, allow for exploratory analysis of the data that provides valuable insights into the problem. Finally, we compare VizFake with deep transfer learning, since it is a very popular black-box approach for image classification, and also with well-known text-based methods. VizFake achieves competitive accuracy with deep transfer learning models while being two orders of magnitude faster and not requiring laborious hyper-parameter tuning.
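
The pipeline the abstract describes (stack screenshots into a tensor, decompose it, then classify with few labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the screenshots are assumed to be grayscale matrices of equal size, the decomposition is assumed to be a plain CP decomposition fitted by alternating least squares, and a simple k-nearest-neighbour label propagation stands in for whatever propagation scheme the paper actually uses. The function names `cp_als`, `knn_graph`, and `propagate`, and the synthetic data, are hypothetical.

```python
import numpy as np

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-mode tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each update solves a least-squares problem for one factor, others fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def knn_graph(E, k):
    """Symmetric 0/1 adjacency linking each row of E to its k nearest rows."""
    n = E.shape[0]
    d = np.linalg.norm(E[:, None, :] - E[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    idx = np.argsort(d, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)

def propagate(W, labels, n_iter=100):
    """labels: +1 or -1 where known, 0 where unknown. Returns +1/-1 for every node."""
    known = labels != 0
    f = labels.astype(float)
    deg = W.sum(axis=1)
    for _ in range(n_iter):
        f = (W @ f) / deg                # average over neighbours
        f[known] = labels[known]         # clamp the known labels
    return np.where(f >= 0, 1, -1)

# Synthetic stand-in for screenshots: two "layouts", 20 noisy copies each.
rng = np.random.default_rng(1)
H, Wd, n_per = 16, 12, 20
layouts = [np.outer(rng.standard_normal(H), rng.standard_normal(Wd)) for _ in range(2)]
truth = np.array([1] * n_per + [-1] * n_per)
X = np.stack([layouts[0 if t == 1 else 1] + 0.01 * rng.standard_normal((H, Wd))
              for t in truth], axis=2)   # shape: (height, width, n_screenshots)

_, _, C = cp_als(X, rank=2)              # rows of C embed the screenshots
labels = np.zeros(len(truth), dtype=int)
labels[[0, 1, 20, 21]] = truth[[0, 1, 20, 21]]   # only 4 of 40 labels are known
pred = propagate(knn_graph(C, k=10), labels)
print("accuracy:", (pred == truth).mean())
```

Only the third factor matrix is used for classification here, since its rows give one low-dimensional embedding per screenshot; the first two factors describe spatial patterns shared across screenshots and are the kind of representation that could support exploratory analysis.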


research
05/08/2020

HiJoD: Semi-Supervised Multi-aspect Detection of Misinformation using Hierarchical Joint Decomposition

Distinguishing between misinformation and real information is one of the...
research
04/24/2018

Semi-supervised Content-based Detection of Misinformation via Tensor Embeddings

Fake news may be intentionally created to promote economic, political an...
research
08/06/2020

aschern at SemEval-2020 Task 11: It Takes Three to Tango: RoBERTa, CRF, and Transfer Learning

We describe our system for SemEval-2020 Task 11 on Detection of Propagan...
research
12/20/2018

Transfer Learning in Astronomy: A New Machine-Learning Paradigm

The widespread dissemination of machine learning tools in science, parti...
research
07/22/2023

Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models

Misinformation on YouTube is a significant concern, necessitating robust...
research
03/23/2016

BreakingNews: Article Annotation by Image and Text Processing

Building upon recent Deep Neural Network architectures, current approach...
research
01/05/2021

COVID-19: Comparative Analysis of Methods for Identifying Articles Related to Therapeutics and Vaccines without Using Labeled Data

Here we proposed an approach to analyze text classification methods base...
