CrossAug: A Contrastive Data Augmentation Method for Debiasing Fact Verification Models

by Minwoo Lee, et al.
Seoul National University

Fact verification datasets are typically constructed using crowdsourcing techniques due to the lack of text sources with veracity labels. However, the crowdsourcing process often introduces undesired biases into the data, which cause models to learn spurious patterns. In this paper, we propose CrossAug, a contrastive data augmentation method for debiasing fact verification models. Specifically, we employ a two-stage augmentation pipeline to generate new claims and evidence from existing samples. The generated samples are then paired cross-wise with the original pair, forming contrastive samples that help the model rely less on spurious patterns and learn more robust representations. Experimental results show that our method outperforms the previous state-of-the-art debiasing technique by 3.6% on the debiased extension of the FEVER dataset, with a total performance boost of 10.13% from the baseline. Furthermore, we evaluate our approach in data-scarce settings, where models can be more susceptible to biases due to the lack of training data. Experimental results demonstrate that our approach is also effective at debiasing in these low-resource conditions, exceeding the baseline performance on the Symmetric dataset with just 1% of the original data.
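The cross-wise pairing step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sample` type, the `cross_pair` helper, and the two-label scheme (`SUPPORTS` / `REFUTES`) are hypothetical names chosen for illustration, and the label assigned to each crossed pair is an assumption based on the idea that changing only one side of a pair flips its veracity. The generated claim and evidence are passed in as plain strings; how they are produced (the two-stage augmentation pipeline) is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    """A claim-evidence pair with a veracity label (hypothetical type)."""
    claim: str
    evidence: str
    label: str  # "SUPPORTS" or "REFUTES" in this sketch


# Assumed two-label scheme; other labels are rejected rather than guessed.
_OPPOSITE = {"SUPPORTS": "REFUTES", "REFUTES": "SUPPORTS"}


def cross_pair(original: Sample, new_claim: str, new_evidence: str) -> List[Sample]:
    """Pair the generated claim and evidence cross-wise with the original pair.

    Assumption: the generated claim contradicts the original claim, and the
    generated evidence is edited to agree with the generated claim. Swapping
    only one side therefore flips the label; swapping both restores it.
    """
    if original.label not in _OPPOSITE:
        raise ValueError(f"unsupported label: {original.label!r}")
    flipped = _OPPOSITE[original.label]
    return [
        original,                                            # (c, e)
        Sample(new_claim, original.evidence, flipped),       # (c', e)
        Sample(original.claim, new_evidence, flipped),       # (c, e')
        Sample(new_claim, new_evidence, original.label),     # (c', e')
    ]


if __name__ == "__main__":
    # Hand-written toy strings standing in for generated text.
    seed = Sample("The tower is in Paris.", "The tower stands in Paris.", "SUPPORTS")
    for s in cross_pair(seed, "The tower is in Rome.", "The tower stands in Rome."):
        print(s.label, "|", s.claim, "|", s.evidence)
```

Because each claim and each piece of evidence appears once with each label, a model cannot predict the label from the claim alone, which is the kind of spurious shortcut the contrastive samples are meant to discourage.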




