CoCoLM: COmplex COmmonsense Enhanced Language Model

12/31/2020
by Changlong Yu, et al.

Large-scale pre-trained language models have demonstrated strong knowledge representation ability. However, recent studies suggest that even though these giant models contain rich simple commonsense knowledge (e.g., birds can fly and fish can swim), they often struggle with complex commonsense knowledge that involves multiple eventualities (verb-centric phrases, e.g., identifying the relationship between “Jim yells at Bob” and “Bob is upset”). To address this problem, in this paper, we propose to help pre-trained language models better incorporate complex commonsense knowledge. Unlike existing fine-tuning approaches, we do not focus on a specific task; instead, we propose a general language model named CoCoLM. Through careful training over a large-scale eventuality knowledge graph, ASER, we successfully teach pre-trained language models (i.e., BERT and RoBERTa) rich complex commonsense knowledge among eventualities. Experiments on multiple downstream commonsense tasks that require the correct understanding of eventualities demonstrate the effectiveness of CoCoLM.
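To make the idea concrete, here is a minimal sketch of one way complex commonsense of this kind could be injected into a pre-trained model: fine-tuning BERT to predict the discourse relation that holds between two eventualities, using pairs of the sort an eventuality knowledge graph such as ASER provides. This is an illustrative assumption, not the paper's actual training procedure; the `EVENTUALITY_PAIRS` data, the relation labels, and all hyperparameters are hypothetical toy stand-ins.

```python
# Hypothetical sketch: fine-tune BERT to classify the discourse relation
# between eventuality pairs (the kind of edges ASER provides at scale).
# Not the authors' actual CoCoLM training objective.
import torch
from torch.utils.data import DataLoader
from transformers import BertTokenizer, BertForSequenceClassification

# Toy stand-in for (head eventuality, tail eventuality, relation) triples.
EVENTUALITY_PAIRS = [
    ("Jim yells at Bob", "Bob is upset", "Result"),
    ("birds can fly", "fish can swim", "Conjunction"),
]
RELATIONS = sorted({r for _, _, r in EVENTUALITY_PAIRS})
LABEL2ID = {r: i for i, r in enumerate(RELATIONS)}

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RELATIONS)
)

def encode(batch):
    # Encode each pair as "[CLS] head [SEP] tail [SEP]" and attach the
    # relation label so the classification head can be trained on it.
    heads, tails, rels = zip(*batch)
    enc = tokenizer(list(heads), list(tails),
                    padding=True, truncation=True, return_tensors="pt")
    enc["labels"] = torch.tensor([LABEL2ID[r] for r in rels])
    return enc

loader = DataLoader(EVENTUALITY_PAIRS, batch_size=2, collate_fn=encode)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for batch in loader:
    loss = model(**batch).loss  # cross-entropy over relation labels
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In the setting the abstract describes, such pair-level supervision would be drawn at scale from a graph like ASER rather than from hand-written examples, so the model sees many eventuality pairs and the relations between them during training.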

