Stanceosaurus: Classifying Stance Towards Multilingual Misinformation

10/28/2022
by Jonathan Zheng, et al.

We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi, and Arabic annotated with stance towards 251 misinformation claims. As far as we are aware, it is the largest corpus annotated with stance towards misinformation claims. The claims in Stanceosaurus originate from 15 fact-checking sources that cover diverse geographical regions and cultures. Unlike existing stance datasets, we introduce a more fine-grained 5-class labeling strategy with additional subcategories to distinguish implicit stance. Pre-trained transformer-based stance classifiers that are fine-tuned on our corpus generalize well to unseen claims and to regional claims from countries outside the training data. Cross-lingual experiments demonstrate that Stanceosaurus can be used to train multilingual models, achieving 53.1 F1 on Hindi and 50.4 F1 on Arabic without any target-language fine-tuning. Finally, we show how a domain adaptation method can be used to improve performance on Stanceosaurus using additional RumourEval-2019 data. We make Stanceosaurus publicly available to the research community and hope it will encourage further work on misinformation identification across languages and cultures.

