Does BERT Solve Commonsense Task via Commonsense Knowledge?

08/10/2020
by Leyang Cui, et al.

The success of pre-trained contextualized language models such as BERT motivates a line of work that investigates linguistic knowledge inside such models, in order to explain the large improvements on downstream tasks. While previous work shows syntactic, semantic, and word sense knowledge in BERT, little work has been done on investigating how BERT solves CommonsenseQA tasks. In particular, it is an interesting research question whether BERT relies on shallow syntactic patterns or on deeper commonsense knowledge for disambiguation. We propose two attention-based methods to analyze commonsense knowledge inside BERT and the contribution of such knowledge to the model's predictions. We find that attention heads successfully capture the structured commonsense knowledge encoded in ConceptNet, which helps BERT solve commonsense tasks directly. Fine-tuning further makes BERT learn to use commonsense knowledge in higher layers.
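To make the attention-based analysis concrete, below is a minimal sketch of how per-head attention weights between a ConceptNet-related concept pair can be read out of BERT with the Hugging Face transformers library. This is not the paper's exact probing protocol; the example sentence, the ("bird", "fly") concept pair, and the per-head maximum readout are illustrative assumptions.

```python
# Sketch: inspect whether any BERT attention head links two concepts
# that are related in ConceptNet (here "bird" CapableOf "fly" is assumed
# as an illustrative triple; not the paper's exact setup).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "a bird can fly because it has wings"
inputs = tokenizer(sentence, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

src_idx = tokens.index("fly")   # attending token
tgt_idx = tokens.index("bird")  # ConceptNet-related target token

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple of 12 layer tensors,
# each of shape (batch, num_heads, seq_len, seq_len).
for layer, att in enumerate(outputs.attentions):
    weights = att[0, :, src_idx, tgt_idx]  # per-head weight: fly -> bird
    best_head = int(torch.argmax(weights))
    print(f"layer {layer:2d}: head {best_head:2d} attends "
          f"fly -> bird with weight {weights[best_head].item():.3f}")
```

Aggregating such per-head readouts over many ConceptNet triples is one way to quantify, per layer and head, how strongly the model links commonsense-related concepts, and to compare pre-trained against fine-tuned checkpoints.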


Related research

12/31/2020 · CoCoLM: COmplex COmmonsense Enhanced Language Model
Large-scale pre-trained language models have demonstrated strong knowled...

10/12/2021 · ALL Dolphins Are Intelligent and SOME Are Friendly: Probing BERT for Nouns' Semantic Properties and their Prototypicality
Large scale language models encode rich commonsense knowledge acquired t...

05/16/2021 · How is BERT surprised? Layerwise detection of linguistic anomalies
Transformer language models have shown remarkable ability in detecting w...

05/22/2023 · GATology for Linguistics: What Syntactic Dependencies It Knows
Graph Attention Network (GAT) is a graph neural network which is one of ...

05/24/2020 · Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Following the major success of neural language models (LMs) such as BERT...

03/09/2021 · BERTese: Learning to Speak to BERT
Large pre-trained language models have been shown to encode large amount...

10/12/2022 · Probing Commonsense Knowledge in Pre-trained Language Models with Sense-level Precision and Expanded Vocabulary
Progress on commonsense reasoning is usually measured from performance i...
