CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval

09/05/2022
by Kung-Hsiang Huang, et al.

Fact-checking has gained increasing attention due to the wide spread of falsified information. Most fact-checking approaches focus only on claims made in English because data is scarce in other languages. The lack of fact-checking datasets in low-resource languages calls for an effective cross-lingual transfer technique for fact-checking. Additionally, trustworthy information in different languages can be complementary and helpful in verifying facts. To this end, we present the first fact-checking framework augmented with cross-lingual retrieval, which aggregates evidence retrieved from multiple languages through a cross-lingual retriever. Given the absence of cross-lingual information retrieval datasets with claim-like queries, we train the retriever with our proposed Cross-lingual Inverse Cloze Task (X-ICT), a self-supervised algorithm that creates training instances by translating the title of a passage. X-ICT teaches cross-lingual retrieval by having the model identify the passage that corresponds to a given translated title. On the X-Fact dataset, our approach achieves a 2.23 F1 improvement in the zero-shot cross-lingual setup over prior systems. The source code and data are publicly available at https://github.com/khuangaf/CONCRETE.
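
To make the X-ICT idea concrete, the following is a minimal sketch of how training instances could be built by translating passage titles. It is not the authors' implementation: the `Passage` and `XICTInstance` types, the `build_xict_instances` function, and the dictionary-based `toy_translate` stand-in for a machine translation system are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Passage:
    title: str
    text: str
    lang: str


@dataclass
class XICTInstance:
    query: str         # translated title, used as a claim-like query
    query_lang: str
    positive: str      # the passage the retriever should identify
    passage_lang: str


def build_xict_instances(
    passages: List[Passage],
    translate: Callable[[str, str, str], str],
    target_langs: List[str],
) -> List[XICTInstance]:
    """Pair each passage with its title translated into other languages.

    Assumption: `translate(text, src_lang, tgt_lang)` wraps some machine
    translation system; the abstract does not say which one is used.
    """
    instances = []
    for passage in passages:
        for lang in target_langs:
            if lang == passage.lang:
                continue  # skip same-language pairs to keep them cross-lingual
            instances.append(
                XICTInstance(
                    query=translate(passage.title, passage.lang, lang),
                    query_lang=lang,
                    positive=passage.text,
                    passage_lang=passage.lang,
                )
            )
    return instances


# Hypothetical toy lexicon standing in for a real translation model.
_TOY_LEXICON: Dict[Tuple[str, str], Dict[str, str]] = {
    ("en", "es"): {"Climate change": "Cambio climático"},
}


def toy_translate(text: str, src: str, tgt: str) -> str:
    # Falls back to the untranslated text when no entry exists.
    return _TOY_LEXICON.get((src, tgt), {}).get(text, text)
```

Each instance pairs a translated title with its original passage, so a retriever trained on such pairs has to match a query in one language to a passage in another. How negatives are sampled and how the retriever is optimized are not covered by the abstract and are left out of this sketch.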

