Inspecting the concept knowledge graph encoded by modern language models

05/27/2021
by Carlos Aspillaga, et al.

The field of natural language understanding has made rapid progress in the last few years, with impressive results on several tasks. This success has motivated researchers to study the underlying knowledge these models encode. Attempts to understand their semantic capabilities, however, have not been successful, often producing inconclusive or contradictory findings across studies. Via a probing classifier, we extract the underlying knowledge graph of nine of the most influential language models of recent years, including word embeddings, text generators, and context encoders. The probe is based on concept relatedness, grounded in WordNet. Our results reveal that all of the models encode this knowledge, but suffer from several inaccuracies. Furthermore, we show that different architectures and training strategies lead to different model biases. We conduct a systematic evaluation to discover specific factors that explain why some concepts are challenging. We hope our insights will motivate the development of models that capture concepts more precisely.
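To make the setup concrete, below is a minimal sketch of such a probe: a small classifier trained on top of frozen contextual embeddings to predict whether two concepts are related in WordNet. Everything here is an illustrative assumption rather than the paper's actual implementation: the choice of bert-base-uncased, the mean-pooling of subwords, the MLP probe, and the hypernym-based positive pairs are all stand-ins for whatever the authors used.

```python
# A minimal sketch of a WordNet relatedness probe over frozen embeddings.
# Assumes Hugging Face `transformers`, `torch`, and `nltk` are installed,
# and that the WordNet corpus has been fetched via nltk.download("wordnet").
# All names and modeling choices below are illustrative, not the paper's.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel
from nltk.corpus import wordnet as wn

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # the language model stays frozen; only the probe is trained

def embed(word: str) -> torch.Tensor:
    """Mean-pool the last hidden states over a word's subword tokens."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # (768,)

class RelatednessProbe(nn.Module):
    """Binary classifier over the concatenation of two concept embeddings."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([a, b], dim=-1))  # relatedness logit

# Positive pairs: a concept paired with one of its WordNet hypernyms,
# e.g. ("dog", "canine"). Negatives (unrelated pairs) are omitted here.
dog = wn.synset("dog.n.01")
pairs = [("dog", h.lemmas()[0].name()) for h in dog.hypernyms()]

probe = RelatednessProbe()
loss_fn = nn.BCEWithLogitsLoss()
optim = torch.optim.Adam(probe.parameters(), lr=1e-3)
for a, b in pairs:
    logit = probe(embed(a), embed(b))
    loss = loss_fn(logit, torch.ones_like(logit))  # label 1 = related
    optim.zero_grad()
    loss.backward()  # gradients reach only the probe; the encoder is frozen
    optim.step()
```

Because the encoder is frozen, the probe's accuracy reflects what the pretrained representations already encode about concept relatedness, which is the core idea behind extracting a knowledge graph from the model.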

