Finding Already Debunked Narratives via Multistage Retrieval: Enabling Cross-Lingual, Cross-Dataset and Zero-Shot Learning

08/10/2023
by Iknoor Singh, et al.

The task of retrieving already debunked narratives aims to detect stories that have already been fact-checked. Detecting such claims successfully not only reduces the manual effort of professional fact-checkers but can also help slow the spread of misinformation. The problem remains understudied, mainly because data is not readily available, and this is especially true of the cross-lingual task, i.e. retrieving fact-checking articles written in a language different from that of the online post being checked. This paper fills this gap by (i) creating a novel dataset to enable research on cross-lingual retrieval of already debunked narratives, using tweets as queries to a database of fact-checking articles; (ii) presenting an extensive experiment that benchmarks fine-tuned and off-the-shelf multilingual pre-trained Transformer models on this task; and (iii) proposing a novel multistage framework that divides cross-lingual debunk retrieval into refinement and re-ranking stages. Results show that cross-lingual retrieval of already debunked narratives is challenging and that off-the-shelf Transformer models fail to outperform a strong lexical baseline (BM25). Nevertheless, our multistage retrieval framework is robust: it outperforms BM25 in most scenarios and enables cross-domain and zero-shot learning without significantly harming the model's performance.
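
The general retrieve-then-re-rank idea behind a multistage framework can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the stages in detail, so the sketch assumes a BM25 first stage that narrows the candidate set and a second stage that re-orders those candidates. The names `BM25`, `jaccard` and `multistage_search` are hypothetical, and the token-overlap re-ranker is only a stand-in for a multilingual Transformer scorer; lexical overlap alone would not bridge languages in a real cross-lingual setting.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper one.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer over a small in-memory corpus."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.tf = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF that stays positive for very common terms.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tf[i], len(self.docs[i])
        total = 0.0
        for t in tokenize(query):
            if t not in tf:
                continue
            num = tf[t] * (self.k1 + 1)
            den = tf[t] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[t] * num / den
        return total

    def top_k(self, query, k):
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


def jaccard(query, doc):
    # Placeholder re-ranker (assumption): swap in a multilingual model score.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if (q | d) else 0.0


def multistage_search(query, docs, bm25, rerank_fn=jaccard, k_first=3, k_final=2):
    # Stage 1: cheap lexical retrieval narrows the candidate set.
    candidates = bm25.top_k(query, k_first)
    # Stage 2: a (normally more expensive) scorer re-orders the candidates.
    reranked = sorted(candidates, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return reranked[:k_final]


docs = [
    "vaccine causes autism claim is false",
    "5g towers do not spread the virus",
    "drinking bleach does not cure the virus",
    "the moon landing was not staged",
]
query = "5g towers spread the virus"
print(multistage_search(query, docs, BM25(docs)))
```

Because `rerank_fn` is a plain function argument, the second stage can be replaced without touching the first, which is one way such a pipeline could be evaluated on new datasets or languages.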


