Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models

02/06/2021
by Lutfi Kerem Senel, et al.

Recent progress in pretraining language models on large corpora has resulted in large performance gains on many NLP tasks. These large models acquire linguistic knowledge during pretraining, which helps to improve performance on downstream tasks via fine-tuning. To assess what kind of knowledge is acquired, language models are commonly probed by querying them with "fill in the blank" style cloze questions. Existing probing datasets mainly focus on knowledge about relations between words and entities. We introduce WDLMPro (Word Definition Language Model Probing) to evaluate word understanding directly using dictionary definitions of words. In our experiments, three popular pretrained language models struggle to match words and their definitions. This indicates that they understand many words poorly and that our new probing task is a difficult challenge that could help guide research on LMs in the future.
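The evaluation described in the abstract, matching each word to its dictionary definition, can be sketched as a ranking task: for every word, score all candidate definitions and count how often the word's own definition ranks first. The sketch below is an illustrative assumption, not the paper's implementation; the toy character-overlap scorer merely stands in for a real language-model score (e.g. the log-probability an LM assigns to the word in a cloze template built from the definition).

```python
# Hypothetical sketch of a WDLMPro-style matching evaluation.
# All names and the toy scorer are illustrative assumptions.

def matching_accuracy(words, definitions, score):
    """Fraction of words whose own definition receives the highest score.

    `score(word, definition)` is any model-derived compatibility score;
    definitions[i] is assumed to be the correct definition of words[i].
    """
    correct = 0
    for i, word in enumerate(words):
        scores = [score(word, d) for d in definitions]
        best = max(range(len(definitions)), key=lambda j: scores[j])
        if best == i:
            correct += 1
    return correct / len(words)

# Toy stand-in scorer: character overlap. A real setup would query an LM.
def toy_score(word, definition):
    return sum(definition.lower().count(c) for c in set(word.lower()))

words = ["wink", "nod"]
definitions = [
    "to close and open one eye quickly, often as a signal",
    "to lower and raise the head briefly, e.g. in agreement",
]
print(matching_accuracy(words, definitions, toy_score))
```

With only two words the chance baseline is 0.5, which illustrates why the task is reported to be hard: a model must rank the correct definition above all distractors, not merely assign it a plausible score.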


Related research

11/10/2020 · When Do You Need Billions of Words of Pretraining Data?
NLP is currently dominated by general-purpose pretrained language models...

09/03/2019 · Language Models as Knowledge Bases?
Recent progress in pretraining language models on large textual corpora ...

03/11/2022 · CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Pretrained language models (PLMs) have achieved superhuman performance o...

03/02/2021 · The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
There is an ongoing debate in the NLP community whether modern language ...

09/16/2022 · Negation, Coordination, and Quantifiers in Contextualized Language Models
With the success of contextualized language models, much research explor...

08/01/2016 · A Neural Knowledge Language Model
Current language models have a significant limitation in the ability to ...

04/05/2021 · What's the best place for an AI conference, Vancouver or ______: Why completing comparative questions is difficult
Although large neural language models (LMs) like BERT can be finetuned t...
