The ever-increasing amount of unverified information online makes it challenging to judge what to believe or distrust. Fact verification, the task of identifying whether a textual claim is supported or refuted by the given evidence text, can play a critical role in recognizing and correcting false information. Consequently, it has drawn lots of attention from the NLP community to promote the veracity and correctness of factual claims.
Recent research in this field has advanced thanks to the release of large-scale datasets (Thorne et al., 2018; Wang, 2017) and the development of pre-trained language models, such as BERT (Devlin et al., 2019), enabling increasingly complex claims to be accurately fact-checked. However, recent works have demonstrated that the process of data collection using crowdsourcing often introduces idiosyncratic biases due to annotation artifacts (Gururangan et al., 2018; Geva et al., 2019; Schuster et al., 2019). These biases are typically characterized as superficial surface patterns that are strongly associated with target labels. As an example, in the FEVER dataset, negation phrases such as “did not” and “failed to” in the claim are highly correlated with the REFUTES label, irrespective of the given evidence (Schuster et al., 2019).
As a result of such biases, models tend to exploit the spurious patterns between shortcut words and labels in the dataset instead of performing factual reasoning over the given evidence, as depicted in Figure 1. In turn, models often appear to perform well on in-domain evaluation sets but show substantial performance degradation on out-of-distribution samples. Moreover, this behavior makes models vulnerable to adversarial sets consisting of counterexamples that cause classification errors in existing systems (Thorne and Vlachos, 2019). Therefore, overcoming such biases is a key challenge in developing robust fact verification models.
To tackle this issue, previous methods either reduce the importance of biased examples via a modified training objective (Karimi Mahabadi et al., 2020), regularize the confidence of the model on biased examples (Utama et al., 2020), or train a model in an ensemble with a biased model to discourage it from leveraging statistical shortcuts (Clark et al., 2019). However, the majority of these methods target a specific bias; as a result, they achieve improvements on the targeted evaluation set while generally performing poorly on evaluation sets that include different types of biases.
In this paper, we propose CrossAug, an alternative approach for debiasing fact verification models by augmenting the data with contrastive samples. CrossAug generates new data samples through a novel two-stage augmentation pipeline: 1) neural-based negative claim generation and 2) lexical search-based evidence modification. The generated claim and evidence are then paired cross-wise with the original pair, yielding contrastive pairs that are subtly different from one another in respect to context but are assigned opposite labels. We postulate that such contrastive samples encourage a model to rely less on spurious correlations, leading to better reasoning capabilities by learning more robust representations for the task.
Indeed, our approach outperforms the regularization-based state-of-the-art method by 3.6% on the Symmetric FEVER dataset (Schuster et al., 2019), an unbiased evaluation set, and also shows a consistent performance boost on other fact verification datasets.
To verify the performance of the proposed debiasing method in real-world application scenarios, where fact verification datasets often have a limited amount of data, we further experiment on data-scarce settings by sub-sampling the FEVER set. Experimental results demonstrate that our approach is also effective at debiasing in these low-resource conditions, exceeding the baseline performance on the Symmetric dataset with just 1% of the original data.
In summary, our contributions in this work are as follows:
We propose CrossAug, a novel contrastive data augmentation method for debiasing fact verification models.
We empirically show that training a model with the data augmented by our proposed method leads to the state-of-the-art performance on the Symmetric FEVER dataset.
Our augmentation-based debiasing approach shows performance improvements particularly in low-resource conditions compared to previous regularization-based debiasing approaches.
2.1. Task Formulation
Given a textual claim c and evidence e, the objective of the fact verification task is to predict whether the claim is supported by the evidence, refuted by the evidence, or whether the evidence does not provide enough information for verification. We denote the i-th sample in a dataset of size N as a triplet (cᵢ, eᵢ, yᵢ), where yᵢ is the data label.
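For concreteness, a dataset entry under this formulation can be represented as follows (a minimal illustrative sketch; the class and field names are ours, not from the paper):

```python
from dataclasses import dataclass

# The three FEVER-style labels used in the task formulation.
LABELS = ("SUPPORTS", "REFUTES", "NOT ENOUGH INFO")

@dataclass
class Sample:
    claim: str      # c_i: the textual claim to verify
    evidence: str   # e_i: the evidence text
    label: str      # y_i: one of LABELS

sample = Sample(
    claim="The film was released over 30 days after its premiere.",
    evidence="The film saw wide release 45 days after premiering.",
    label="SUPPORTS",
)
assert sample.label in LABELS
```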
| Train method | FEVER dev | Symmetric | Adversarial | FM2 dev | Δ Sym. | Δ avg. |
| --- | --- | --- | --- | --- | --- | --- |
| No augmentation (baseline) | 86.15 ± 0.42 | 58.77 ± 1.29 | 49.66 ± 0.37 | 40.81 ± 0.43 | – | – |
| EDA | 85.09 ± 0.25 | 58.55 ± 1.63 | 51.41 ± 1.14 | 41.21 ± 1.11 | −0.22% | +0.22% |
| Paraphrasing | 84.33 ± 0.34 | 59.02 ± 1.38 | 52.53 ± 1.20 | 40.60 ± 0.71 | +0.25% | +0.27% |
| Re-weighting | 85.56 ± 0.32 | 61.87 ± 1.16 | 49.92 ± 0.80 | 43.80 ± 0.46 | +3.10% | +1.44% |
| Product of Experts (PoE) | 86.50 ± 0.35 | 65.30 ± 1.73 | 51.07 ± 1.20 | 46.69 ± 1.11 | +6.53% | +3.54% |
| CrossAug (ours) | 85.34 ± 0.68 | 68.90 ± 1.68 | 51.78 ± 1.02 | 44.17 ± 1.27 | +10.13% | +3.70% |
| – Negative claim only augmentation | 85.70 ± 0.28 | 61.00 ± 0.71 | 51.96 ± 0.90 | 43.06 ± 0.40 | +2.23% | +1.58% |
| – Negative evidence only augmentation | 85.87 ± 0.16 | 67.06 ± 0.99 | 51.46 ± 0.43 | 43.70 ± 0.97 | +8.29% | +3.18% |
Table 1: Experimental results on fact verification datasets. The mean and standard deviation of the classification accuracy over five runs are reported for each method.
2.2. Data Augmentation Pipeline
In our proposed method, we generate three additional synthetic samples for each original claim-evidence pair through two stages of augmentation. Note that CrossAug utilizes only the positive claims (SUPPORTS claims) in the FEVER dataset, which are verifiable by specific evidence. The whole process of our data augmentation pipeline is shown in Figure 2.
(1) Negative Claim Generation: The first stage generates a negative claim c⁻ from a positive claim c⁺ using a neural sequence-to-sequence model. This generative process involves transformations of the positive claim, such as inserting a negation or replacing a word with an antonym. The generated claim thus carries a different meaning and is refuted by the evidence e that supports the positive claim, which is why we call it a negative claim. Through this process, we form a new data sample (c⁻, e, REFUTES).
To this end, we fine-tune BART (Lewis et al., 2020) on WikiFactCheck-English dataset, which provides pairs of positive claims and their corresponding negative claims (Sathe et al., 2020). In fine-tuning, positive claims are used as the source text and negative claims are taken as the target text of the model. To provide a richer context, we also fine-tune the model with the source and target text reversed since a positive claim can be seen as a refuted version of a negative claim.
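The bidirectional fine-tuning data construction described above can be sketched as follows (an illustrative simplification; the function and field names are our assumptions, not the authors' code):

```python
def build_seq2seq_pairs(claim_pairs):
    """Build BART fine-tuning pairs from (positive, negative) claim pairs.

    Each pair is used in both directions: positive -> negative, and,
    since a positive claim can be seen as a refuted version of the
    negative one, negative -> positive as well.
    """
    pairs = []
    for pos, neg in claim_pairs:
        pairs.append({"source": pos, "target": neg})  # forward direction
        pairs.append({"source": neg, "target": pos})  # reversed direction
    return pairs

data = [("Paris is the capital of France.",
         "Paris is not the capital of France.")]
pairs = build_seq2seq_pairs(data)
assert pairs[0]["source"] == pairs[1]["target"]  # both directions present
```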
(2) Evidence Modification: The negative claim generated in the first stage often differs from the positive claim by only a few words, and the change can thus be seen as a span replacement. For example, “over 30 days” is simply substituted with “less than 10 days”, as shown in Figure 2. This phenomenon is due to the characteristics of the data used for fine-tuning the model: the negative claims were manually written by annotators under constraints on both sentence length and subject so as to be similar to the positive claims (Sathe et al., 2020). We also observe that the words replaced in the positive claim are often found verbatim in the evidence. This is because the changed part of the claim usually corresponds to factual information taken from the evidence.
As the second stage of our data augmentation process, we build on these observations to perform a lexical search-based evidence modification. First, we compare the positive claim with the negative claim to identify the part changed in the first stage. Once the words replaced in the positive claim are recognized (“over 30 days” in Figure 2), we search for the same words in the evidence and replace them with the substituted words from the negative claim (“less than 10 days” in Figure 2). Since this substitution induces the same factual modification on the evidence as that applied to the negative claim, it logically follows that the resulting modified evidence e⁻ supports the negative claim c⁻ and refutes the positive claim c⁺. Consequently, we form two additional contrastive samples (c⁻, e⁻, SUPPORTS) and (c⁺, e⁻, REFUTES) in the second stage.
Exceptional Cases: In the first stage, the generated negative claim c⁻ is occasionally exactly the same as the positive claim c⁺. For such samples, we skip the augmentation. Also, in the second stage, we carry out the evidence modification only when the number of replaced words in the claims is less than or equal to k, where k is a threshold value. This is necessary to prevent invalid evidence modifications: when the replaced part is large, it frequently contains terms inappropriate for reconstructing the evidence, such as non-factual words, producing an illogical sentence. However, we still keep the sample from the first stage even when the evidence modification stage is skipped.
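Putting the second stage and its exceptional cases together, the evidence modification can be sketched with a word-level diff. This is our own minimal reconstruction of the described procedure, not the released implementation, and the default threshold value is an arbitrary placeholder:

```python
import difflib

def modify_evidence(pos_claim, neg_claim, evidence, k=3):
    """Replace in the evidence the span that changed between the positive
    and negative claims. Returns None when the modification is skipped:
    identical claims, more than one changed span, a span longer than k
    words, or a span not found verbatim in the evidence."""
    pos_words, neg_words = pos_claim.split(), neg_claim.split()
    if pos_words == neg_words:          # generation produced an identical claim
        return None
    sm = difflib.SequenceMatcher(a=pos_words, b=neg_words)
    replaced = [(pos_words[i1:i2], neg_words[j1:j2])
                for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]
    if len(replaced) != 1:              # only handle a single changed span
        return None
    old_span, new_span = replaced[0]
    if len(old_span) > k or len(new_span) > k:  # span too large: skip
        return None
    old, new = " ".join(old_span), " ".join(new_span)
    if old not in evidence:             # span must appear verbatim in evidence
        return None
    return evidence.replace(old, new, 1)

pos = "The trial lasted over 30 days in total."
neg = "The trial lasted less than 10 days in total."
ev = "Court records show the trial lasted over 30 days."
print(modify_evidence(pos, neg, ev))
# -> Court records show the trial lasted less than 10 days.
```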
3. Experiments
For our experiments, we evaluate our proposed data augmentation method on four datasets, including FEVER, and compare its performance with existing methods.
3.1. Datasets
FEVER (Thorne et al., 2018) is a crowdsourced fact verification dataset containing claim-evidence pairs based on Wikipedia articles. We only use the claims paired with a single evidence for training and evaluation in this work.
Symmetric (Schuster et al., 2019) is a test set based on the FEVER development set designed for unbiased evaluation. It is carefully constructed to eliminate the correlation between claim n-grams and labels.
Adversarial (Thorne and Vlachos, 2019) is an adversarially constructed dataset explicitly designed to induce errors in models trained on the FEVER dataset.
Fool Me Twice (FM2) (Eisenschlos et al., 2021) is a Wikipedia-based fact verification dataset composed of 13k claim-evidence pairs collected through games between crowd-workers.
3.2. Compared Methods
Data Augmentation Methods: We compare our method against two data augmentation techniques commonly used across natural language processing tasks: Easy Data Augmentation (EDA) (Wei and Zou, 2019) and neural paraphrasing. EDA applies simple mutations, such as random swapping or synonym replacement, to the original sentence to generate new examples. For neural paraphrasing, we use a GPT-2 model (Radford et al., 2019) fine-tuned on back-translated data to paraphrase the original text (Krishna et al., 2020). For each original claim-evidence pair, we create a new pair that holds the same relation by transforming only the claim, leading to an augmentation ratio of original to augmented data of 1:1.
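For reference, one of EDA's four mutation operations, random swap, can be sketched as follows (a minimal sketch of the technique, not the original EDA code):

```python
import random

def eda_random_swap(sentence, n_swaps=1, seed=0):
    """Randomly swap two word positions, repeated n_swaps times.
    One of the four simple mutations in EDA (Wei and Zou, 2019)."""
    rng = random.Random(seed)
    words = sentence.split()
    for _ in range(n_swaps):
        i, j = rng.randrange(len(words)), rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]  # swap the two positions
    return " ".join(words)

print(eda_random_swap("the movie was released in 1999", n_swaps=1, seed=0))
```

The output is a permutation of the original words, so the label of the claim-evidence pair is assumed to be preserved.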
Regularization-based Debiasing Methods: We also compare with two debiasing techniques that reduce the reliance on biases by regularizing the model on the biased samples. The first one is an example re-weighting method that targets biases from the shortcut words (Schuster et al., 2019). By re-weighting the importance of claims containing those words, it forces a model to focus on the hard examples in which relying on the bias results in incorrect predictions. The other one is Product of Experts (PoE) (Karimi Mahabadi et al., 2020), which computes the training loss in an ensemble of the base model and the bias-only model. Similar to the first method, it controls the base model’s loss depending on the prediction of the bias-only model for each example.
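The PoE combination can be illustrated as follows: the log-probabilities of the base and bias-only models are summed before computing cross-entropy, so examples that the bias-only model already classifies confidently contribute less loss. This is a schematic NumPy sketch, not the authors' implementation:

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))

def poe_loss(base_logits, bias_logits, labels):
    """Product of Experts: cross-entropy over the renormalized product
    of the base and bias-only model distributions."""
    combined = log_softmax(base_logits) + log_softmax(bias_logits)
    log_probs = log_softmax(combined)
    return -log_probs[np.arange(len(labels)), labels].mean()

base = np.array([[2.0, 0.5, -1.0]])
unbiased = np.array([[0.0, 0.0, 0.0]])     # uninformative bias-only model
confident = np.array([[5.0, -5.0, -5.0]])  # bias-only model sure of label 0
y = np.array([0])
# A confident, correct bias-only model reduces the loss (and thus the
# gradient signal) for the base model on that example.
assert poe_loss(base, confident, y) < poe_loss(base, unbiased, y)
```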
3.3. Implementation Details
For our experiments, we use the BERT-base-uncased model (Devlin et al., 2019), which demonstrates competitive performance on fact verification tasks. We fine-tune BERT with an additional classification layer on top of the [CLS] token embedding. We concatenate the claim and the evidence, inserting a [SEP] token in between, to form the input sequence. Following previous works, we set the maximum sequence length to 128 and the batch size to 32, and optimize the model with a standard cross-entropy loss using the Adam optimizer (Kingma and Ba, 2015)
with a learning rate of 2e-5. We train the model on the FEVER train set and evaluate generalization performance using the development sets of the Symmetric, Adversarial, and FM2 datasets. We train the model for 3 epochs with 5 different random seeds for all experiments and report the averaged results. For our augmentation pipeline, we set the maximum span size k for evidence modification, which produces an augmented dataset with an augmentation ratio of 1:0.58.
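The input construction described above can be sketched as follows (whitespace tokenization stands in for BERT's WordPiece tokenizer here, which is a simplification; [CLS] and [SEP] are BERT's standard special tokens):

```python
def build_input(claim, evidence, max_len=128):
    """Build a single BERT-style input sequence:
    [CLS] claim tokens [SEP] evidence tokens [SEP], truncated to max_len."""
    tokens = ["[CLS]"] + claim.split() + ["[SEP]"] + evidence.split() + ["[SEP]"]
    return tokens[:max_len]

seq = build_input("Paris is the capital of France.",
                  "Paris has been France's capital since 987.")
print(seq[:3])  # ['[CLS]', 'Paris', 'is']
```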
3.4. Results on the Full Dataset
First, we augment the full FEVER train set with our approach and compare performance in Table 1. Our proposed method achieves a 10.13% improvement over the baseline and a 3.6% improvement over the previous state-of-the-art debiasing technique on the Symmetric dataset. This result shows that our method is highly effective at preventing the model from predicting based on spurious biases. Our approach also shows a 2.12% improvement on the Adversarial dataset and a 3.36% improvement on the FM2 dataset compared to the baseline, indicating that our augmentation method benefits not just the diagnostic dataset for lexical bias but fact verification in general. Finally, our method leads to the greatest overall improvement across the datasets out of all compared training methods. This result empirically shows that the contrastive samples generated by our augmentation method enhance factual reasoning capabilities by encouraging the model to learn a more robust feature representation, achieving strong generalization.
Compared to our approach, the example re-weighting and PoE methods perform slightly worse on the Symmetric and Adversarial datasets and slightly better on the original FEVER development set and the FM2 dataset. On the other hand, EDA and paraphrasing augmentations show negligible performance improvement on the Symmetric dataset. These results suggest that simply training with more data does not necessarily help mitigate the bias in data.
3.5. Ablation Studies
We conduct an ablation study to verify the effectiveness of the augmented samples generated at each step of our augmentation process. The results in Table 1 reveal that a moderate performance improvement on the Symmetric, Adversarial, and FM2 evaluation sets is attained even when using only the negative claims generated in the first step. However, the performance on the Symmetric dataset is still significantly lower than with the full augmentation method, implying that augmenting with negative claims alone is less effective for debiasing the model.
Training with the negative-evidence-augmented data, on the other hand, exhibits more competitive performance, notably outperforming the previous state-of-the-art technique on the Symmetric dataset. The results imply that the key component of our debiasing approach is training the model with contrastive samples sharing the same claim, which enables the model to learn by comparing the claim against the input evidence instead of relying on artifacts in the claim. Nevertheless, the full augmentation method outperforms all ablations, indicating that contrastive data samples in general help the model learn more robust representations.
3.6. Results on Low-resource Conditions
In the real world, the available training corpus for fact verification is often small due to the high cost and difficulty of collecting and labelling data. Training on such a limited amount of data can lead to even more biased models due to the lack of samples from which to learn factual reasoning. Therefore, it is important to investigate whether debiasing methods perform well in low-resource conditions.
Thus, we further evaluate our data augmentation method in low-resource conditions, simulating a data-scarce setting by sampling subsets of the original FEVER training data.
We perform class-balanced sub-sampling of the FEVER training data with 5 different random seeds, and train the model on each subset with another 5 different random seeds to account for statistical variation.
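The class-balanced sub-sampling above can be sketched as follows (our own sketch of the protocol, not the exact experimental code):

```python
import random
from collections import defaultdict

def class_balanced_subsample(samples, fraction, seed=0):
    """Sub-sample the same fraction of examples from each label class,
    so the subset keeps the original class balance."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample["label"]].append(sample)
    subset = []
    for label, group in by_label.items():
        n = max(1, round(len(group) * fraction))  # keep at least one per class
        subset.extend(rng.sample(group, n))
    return subset

data = [{"label": "SUPPORTS"}] * 100 + [{"label": "REFUTES"}] * 100
subset = class_balanced_subsample(data, fraction=0.01)
print(len(subset))  # 2: one example per class
```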
As in the previous experiments, we augment each subset with an augmentation ratio of 1:1 for the EDA and paraphrasing methods, while our augmentation method results in an average augmentation ratio of 1:0.47.
We evaluate on subsets of 0.1%, 0.2%, 0.5%, and 1.0% of the original training data, and the results are presented in Figure 3.
Our augmentation method shows a consistent improvement in the low-resource conditions across all evaluated datasets, and notably outperforms the baseline trained on the full dataset with just 1% of the original training data on the Symmetric evaluation set. On the other hand, the PoE method shows little to no improvement on all of the datasets except FM2, and in some cases shows a slight performance drop. These results indicate that PoE, which relies on training a biased model to regularize learning from biased samples, does not generalize well in data-scarce settings.
EDA and paraphrasing augmentation show a moderate improvement over the baselines, verifying their effectiveness in low-resource conditions. However, our augmentation method still achieves a marked improvement over EDA and paraphrasing on the Adversarial and FM2 datasets, implying that our approach is more robust and effective at generalizing to out-of-distribution samples. In summary, the results with varying training dataset sizes show that our augmentation method is effective for low-resource domains.
4. Conclusion
In this work, we propose a novel data augmentation method for debiasing fact verification models. Our approach generates negative claim and evidence pairs and forms contrastive samples to augment the data, which encourages the trained model to rely less on spurious correlations and to learn better representations. We evaluate our approach on various fact verification datasets and show that it outperforms previous methods on the unbiased evaluation set. We also show that our approach is effective in low-resource conditions with limited data compared to regularization-based debiasing approaches.
Acknowledgements. Kyomin Jung is with ASRI, Seoul National University, Seoul, Korea. This work was supported by AIRS Company in Hyundai Motor and Kia through HMC/KIA-SNU AI Consortium Fund.
References
- Clark et al. (2019). Don’t take the easy way out: ensemble based methods for avoiding known dataset biases. In Proceedings of EMNLP-IJCNLP, Hong Kong, China, pp. 4069–4082.
- Devlin et al. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186.
- Eisenschlos et al. (2021). Fool me twice: entailment from Wikipedia gamification. In Proceedings of NAACL-HLT, pp. 352–365.
- Geva et al. (2019). Are we modeling the task or the annotator? An investigation of annotator bias in natural language understanding datasets. In Proceedings of EMNLP-IJCNLP, pp. 1161–1166.
- Gururangan et al. (2018). Annotation artifacts in natural language inference data. In Proceedings of NAACL-HLT.
- Karimi Mahabadi et al. (2020). End-to-end bias mitigation by modelling biases in corpora. In Proceedings of ACL, Online, pp. 8706–8716.
- Kingma and Ba (2015). Adam: a method for stochastic optimization. In Proceedings of ICLR, San Diego, CA, USA.
- Krishna et al. (2020). Reformulating unsupervised style transfer as paraphrase generation. In Proceedings of EMNLP, Online, pp. 737–762.
- Lewis et al. (2020). BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of ACL, pp. 7871–7880.
- Radford et al. (2019). Language models are unsupervised multitask learners. OpenAI blog 1 (8), pp. 9.
- Sathe et al. (2020). Automated fact-checking of claims from Wikipedia. In Proceedings of LREC, pp. 6874–6882.
- Schuster et al. (2019). Towards debiasing fact verification models. In Proceedings of EMNLP-IJCNLP, Hong Kong, China, pp. 3419–3425.
- Thorne et al. (2018). FEVER: a large-scale dataset for fact extraction and VERification. In Proceedings of NAACL-HLT, Volume 1 (Long Papers), New Orleans, Louisiana, pp. 809–819.
- Thorne and Vlachos (2019). Adversarial attacks against Fact Extraction and VERification.
- Utama et al. (2020). Mind the trade-off: debiasing NLU models without degrading the in-distribution performance. In Proceedings of ACL, pp. 8717–8729.
- Wang (2017). “Liar, liar pants on fire”: a new benchmark dataset for fake news detection. In Proceedings of ACL.
- Wei and Zou (2019). EDA: easy data augmentation techniques for boosting performance on text classification tasks. In Proceedings of EMNLP-IJCNLP, pp. 6383–6389.