Doing Good or Doing Right? Exploring the Weakness of Commonsense Causal Reasoning Models

07/05/2021
by   Mingyue Han, et al.
0

Pretrained language models (PLM) achieve surprising performance on the Choice of Plausible Alternatives (COPA) task. However, whether PLMs have truly acquired the ability of causal reasoning remains a question. In this paper, we investigate the problem of semantic similarity bias and reveal the vulnerability of current COPA models by certain attacks. Previous solutions that tackle the superficial cues of unbalanced token distribution still encounter the same problem of semantic bias, even more seriously due to the utilization of more training data. We mitigate this problem by simply adding a regularization loss and experimental results show that this solution not only improves the model's generalization ability, but also assists the models to perform more robustly on a challenging dataset, BCOPA-CE, which has unbiased token distribution and is more difficult for models to distinguish cause and effect.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/13/2021

Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation

Determining the plausibility of causal relations between clauses is a co...
research
11/01/2019

When Choosing Plausible Alternatives, Clever Hans can be Clever

Pretrained language models, such as BERT and RoBERTa, have shown large i...
research
08/15/2021

Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning

To quantitatively and intuitively explore the generalization ability of ...
research
05/12/2020

WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge

In this paper, we present the first comprehensive categorization of esse...
research
04/26/2023

Impact of Position Bias on Language Models in Token Classification

Language Models (LMs) have shown state-of-the-art performance in Natural...
research
06/20/2023

FAIR: A Causal Framework for Accurately Inferring Judgments Reversals

Artificial intelligence researchers have made significant advances in le...
research
10/08/2020

Precise Task Formalization Matters in Winograd Schema Evaluations

Performance on the Winograd Schema Challenge (WSC), a respected English ...

Please sign up or login with your details

Forgot password? Click here to reset