Improving the Out-Of-Distribution Generalization Capability of Language Models: Counterfactually-Augmented Data is not Enough

02/18/2023
by Caoyun Fan, et al.

Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability, because CAD induces language models to exploit causal features and exclude spurious correlations. However, the empirical OOD generalization results obtained with CAD are not as strong as expected. In this paper, we attribute this shortfall to the Myopia Phenomenon caused by CAD: language models focus only on the causal features that are edited in the augmentation and exclude other, non-edited causal features. As a result, the potential of CAD is not fully exploited. Based on the structural properties of CAD, we design two additional constraints that help language models extract more complete causal features contained in CAD, thus improving their OOD generalization capability. We evaluate our method on two tasks, Sentiment Analysis and Natural Language Inference, and the experimental results demonstrate that our method can unlock CAD's potential and improve language models' OOD generalization capability.
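
To make the distinction between edited and non-edited features concrete, the following is a minimal sketch, not the authors' method. It assumes a CAD pair is an original sentence plus a minimally edited counterfactual with the opposite label, and it uses a token-level diff to separate the tokens that were edited from those that were left untouched. The sentence pair and the helper name `split_edited_tokens` are hypothetical illustrations.

```python
import difflib


def split_edited_tokens(original: str, counterfactual: str):
    """Return (removed, added, kept) token lists for a counterfactual pair.

    Hypothetical helper for illustration only: tokens are whitespace-split,
    and the diff is computed with difflib.SequenceMatcher from the stdlib.
    """
    a = original.split()
    b = counterfactual.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    removed, added, kept = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Tokens shared by both sentences: not touched by the edit.
            kept.extend(a[i1:i2])
        else:
            # Tokens replaced, deleted, or inserted by the edit.
            removed.extend(a[i1:i2])
            added.extend(b[j1:j2])
    return removed, added, kept


# Hypothetical sentiment pair (positive original, negative counterfactual).
original = "The movie was great and I would recommend it"
counterfactual = "The movie was awful and I would not recommend it"

removed, added, kept = split_edited_tokens(original, counterfactual)
print("edited out:", removed)
print("edited in: ", added)
print("non-edited:", kept)
```

In this pair the edit touches only "great", "awful", and "not", while "recommend" arguably also carries sentiment but is never edited. A model that attends only to the edited tokens would miss such features, which is the kind of behavior the abstract calls the Myopia Phenomenon.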


