Evaluating Generalizability of Fine-Tuned Models for Fake News Detection

05/15/2022
by Abhijit Suprem, et al.

The Covid-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, denoted an 'infodemic' by the CDC and WHO. Misinformation tied to the Covid-19 infodemic changes continuously; this can lead to performance degradation of fine-tuned models due to concept drift. Degradation can be mitigated if models generalize well enough to capture some cyclical aspects of drifted data. In this paper, we explore the generalizability of pre-trained and fine-tuned fake news detectors across 9 fake news datasets. We show that existing models often overfit on their training dataset and perform poorly on unseen data. However, on some subsets of unseen data that overlap with the training data, models have higher accuracy. Based on this observation, we also present KMeans-Proxy, a fast and effective method based on K-Means clustering for identifying these overlapping subsets of unseen data. KMeans-Proxy improves generalizability on unseen fake news datasets by 0.1-0.2 F1 points across datasets. We present both our generalizability experiments and KMeans-Proxy to further research in tackling the fake news problem.
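The abstract does not spell out how KMeans-Proxy decides that an unseen sample overlaps with the training data, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that samples are already embedded as numeric vectors, that K-Means is fit on the training embeddings, and that an unseen sample counts as overlapping when it lies within the radius of its nearest training cluster. The names `fit_proxy` and `overlaps_training`, the radius rule, and the farthest-point initialization are all hypothetical choices made for illustration. The code uses only the standard library to stay self-contained; in practice a library K-Means would likely be used instead.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-Means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        updated = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def fit_proxy(train_points, k):
    """Assumed proxy: training centroids plus a per-cluster radius."""
    centroids = kmeans(train_points, k)
    radii = [0.0] * k
    for p in train_points:
        i = nearest(p, centroids)
        radii[i] = max(radii[i], dist(p, centroids[i]))
    return centroids, radii


def overlaps_training(point, centroids, radii):
    """True if `point` falls inside the radius of its nearest training cluster."""
    i = nearest(point, centroids)
    return dist(point, centroids[i]) <= radii[i]


# Toy 2-D "embeddings" standing in for encoder outputs (illustrative only).
train = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
centroids, radii = fit_proxy(train, k=2)
unseen = [[0.5, 0.6], [5, 5]]
flags = [overlaps_training(p, centroids, radii) for p in unseen]
```

Under these assumptions, a fine-tuned detector's predictions would be trusted only for unseen samples flagged as overlapping, while the remaining samples would be routed to another model or left unlabeled. How the paper actually uses the identified subsets is not stated in the abstract.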


research
05/19/2022

MiDAS: Multi-integrated Domain Adaptive Supervision for Fake News Detection

COVID-19 related misinformation and fake news, coined an 'infodemic', ha...
research
01/28/2021

A transformer based approach for fighting COVID-19 fake news

The rapid outbreak of COVID-19 has caused humanity to come to a stand-st...
research
01/03/2022

An Adversarial Benchmark for Fake News Detection Models

With the proliferation of online misinformation, fake news detection has...
research
10/25/2021

Generating artificial texts as substitution or complement of training data

The quality of artificially generated texts has considerably improved wi...
research
09/09/2023

Analysis of Disinformation and Fake News Detection Using Fine-Tuned Large Language Model

The paper considers the possibility of fine-tuning Llama 2 large languag...
research
07/22/2023

Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models

Misinformation on YouTube is a significant concern, necessitating robust...
research
04/12/2021

On Unifying Misinformation Detection

In this paper, we introduce UnifiedM2, a general-purpose misinformation ...
