Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models

09/07/2021
by Kaixin Ma et al.

Commonsense reasoning benchmarks have largely been solved by fine-tuning language models. The downside is that fine-tuning may cause models to overfit to task-specific data and thereby forget the knowledge gained during pre-training. Recent work therefore proposes only lightweight model updates, since models may already possess useful knowledge from past experience, but it remains unclear which parts of a model should be refined, and to what extent, for a given task. In this paper, we investigate what models learn from commonsense reasoning datasets. We measure the impact of three different adaptation methods on the generalization and accuracy of models. Our experiments with two models show that fine-tuning performs best by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers. We observe that alternative adaptation methods such as prefix-tuning achieve comparable accuracy but generalize better to unseen answers and are more robust to adversarial splits.
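To make the contrast between adaptation methods concrete, the following toy sketch (our own illustration, not the paper's code) shows the core idea of prefix-tuning: the pre-trained weights stay frozen, and only a short sequence of prepended prefix vectors is updated. All names, dimensions, and the toy loss are hypothetical.

```python
import numpy as np

# Hypothetical toy setup: a frozen linear "model" stands in for the
# pre-trained network; prefix-tuning trains only the prepended vectors.
rng = np.random.default_rng(0)
d, prefix_len, seq_len = 4, 2, 3

W = rng.normal(size=(d, d))          # frozen pre-trained weights (never updated)
x = rng.normal(size=(seq_len, d))    # input token embeddings

prefix = np.zeros((prefix_len, d))   # the ONLY trainable parameters

def forward(prefix, x):
    # Prepend the prefix vectors, apply the frozen map, mean-pool the rows.
    h = np.concatenate([prefix, x], axis=0) @ W
    return h.mean(axis=0)

# One update step on a toy squared-error loss, via finite differences,
# so the sketch needs no autograd library.
target = np.ones(d)
loss_before = np.sum((forward(prefix, x) - target) ** 2)

eps = 1e-4
grad = np.zeros_like(prefix)
for i in range(prefix_len):
    for j in range(d):
        p = prefix.copy()
        p[i, j] += eps
        grad[i, j] = (np.sum((forward(p, x) - target) ** 2) - loss_before) / eps

prefix -= 0.1 * grad                 # update the prefix; W stays frozen
loss_after = np.sum((forward(prefix, x) - target) ** 2)
```

Under full fine-tuning, `W` itself would be updated instead, which adapts the model more aggressively but risks the overfitting and forgetting discussed above.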

Related research:

- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning (04/29/2020)
  Fine-tuning of pre-trained transformer models has become the standard ap...
- Editing Commonsense Knowledge in GPT (05/24/2023)
  Memory editing methods for updating encyclopedic knowledge in transforme...
- GPT-Neo for commonsense reasoning - a theoretical and practical lens (11/28/2022)
  Recent work has demonstrated substantial gains in pre-training large-sca...
- Scaling Relationship on Learning Mathematical Reasoning with Large Language Models (08/03/2023)
  Mathematical reasoning is a challenging task for large language models (...
- Modelling Commonsense Properties using Pre-Trained Bi-Encoders (10/06/2022)
  Grasping the commonsense properties of everyday concepts is an important...
- Large Class Separation is not what you need for Relational Reasoning-based OOD Detection (07/12/2023)
  Standard recognition approaches are unable to deal with novel categories...
- ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection (05/25/2022)
  Hate speech detection is complex; it relies on commonsense reasoning, kn...
