Learning to Learn to be Right for the Right Reasons

04/23/2021
by Pride Kavumba, et al.

Improving model generalization on held-out data is a core objective in commonsense reasoning. Recent work has shown that models trained on datasets containing superficial cues tend to perform well on easy test instances that contain those cues but poorly on hard instances that lack them. Previous approaches have resorted to manual methods of discouraging models from overfitting to superficial cues; while some of these methods improve performance on hard instances, they also degrade performance on easy ones. Here, we propose to explicitly learn a model that performs well on both the easy test set, which contains superficial cues, and the hard test set, which does not. Using a meta-learning objective, we learn such a model and improve performance on both test sets. Evaluating on Choice of Plausible Alternatives (COPA) and Commonsense Explanation, we show that the proposed method improves performance on both the easy and the hard test sets, with up to a 16.5-percentage-point improvement over the baseline.
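The abstract does not spell out the meta-learning objective, but the idea of optimizing for post-adaptation performance on both splits can be illustrated with a first-order, MAML-style sketch on a toy problem. Everything below is an assumption for illustration: a logistic-regression model, a synthetic "easy" set with a spurious feature correlated with the label, and a "hard" set without it. The inner step adapts on the easy set; the outer update also accounts for the adapted weights' loss on the hard set.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(w, X, y):
    """Binary cross-entropy loss and its gradient for logistic regression."""
    p = sigmoid(X @ w)
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

# Hypothetical stand-in for easy/hard splits: the true label depends on
# feature 0; the easy set additionally leaks the label through feature 1
# (a "superficial cue"), while the hard set does not.
n, d = 200, 5
X_easy = rng.normal(size=(n, d))
y_easy = (X_easy[:, 0] > 0).astype(float)
X_easy[:, 1] = y_easy + 0.1 * rng.normal(size=n)   # superficial cue
X_hard = rng.normal(size=(n, d))
y_hard = (X_hard[:, 0] > 0).astype(float)          # cue absent

w = np.zeros(d)
inner_lr, outer_lr = 0.5, 0.5
for _ in range(200):
    # Inner step: adapt the weights on the easy set.
    _, g_easy = loss_and_grad(w, X_easy, y_easy)
    w_adapted = w - inner_lr * g_easy
    # Outer step: evaluate the adapted weights on the hard set and update
    # the original weights with a first-order meta-gradient approximation,
    # so the model is pushed to do well on both splits.
    _, g_hard = loss_and_grad(w_adapted, X_hard, y_hard)
    w -= outer_lr * (g_easy + g_hard)

easy_loss, _ = loss_and_grad(w, X_easy, y_easy)
hard_loss, _ = loss_and_grad(w, X_hard, y_hard)
```

After training, both losses sit well below the chance level of ln 2 ≈ 0.693, since the outer objective prevents the model from relying solely on the spurious feature that only exists in the easy set.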


Related research

04/12/2021 · SuperSim: a test set for word similarity and relatedness in Swedish
Language models are notoriously difficult to evaluate. We release SuperS...

11/01/2019 · When Choosing Plausible Alternatives, Clever Hans can be Clever
Pretrained language models, such as BERT and RoBERTa, have shown large i...

08/14/2019 · Towards Debiasing Fact Verification Models
Fact verification requires validating a claim in the context of evidence...

10/06/2021 · Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective
Deep neural networks (DNNs) often rely on easy-to-learn discriminatory f...

08/01/2019 · Visual cues in estimation of part-to-whole comparison
Pie charts were first published in 1801 by William Playfair and have cau...

07/01/1997 · A New Look at the Easy-Hard-Easy Pattern of Combinatorial Search Difficulty
The easy-hard-easy pattern in the difficulty of combinatorial search pro...

09/28/2021 · Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics
Neural sequence models trained with maximum likelihood estimation have l...
