RAW-C: Relatedness of Ambiguous Words–in Context (A New Lexical Resource for English)

05/27/2021
by Sean Trott, et al.

Most words are ambiguous (i.e., they convey distinct meanings in different contexts), and even the meanings of unambiguous words are context-dependent. Both phenomena present a challenge for NLP. Recently, the advent of contextualized word embeddings has led to success on tasks involving lexical ambiguity, such as Word Sense Disambiguation. However, few tasks directly evaluate how well these contextualized embeddings accommodate the more continuous, dynamic nature of word meaning, particularly in a way that matches human intuitions. We introduce RAW-C, a dataset of graded human relatedness judgments for 112 ambiguous words in context (672 sentence pairs in total), together with human estimates of sense dominance. The average inter-annotator agreement, assessed using a leave-one-annotator-out method, was 0.79. We then show that a measure of cosine distance, computed using contextualized embeddings from BERT and ELMo, correlates with human judgments. However, cosine distance also systematically underestimates how similar humans find uses of the same sense of a word, and systematically overestimates how similar humans find uses of different-sense homonyms. Finally, we propose a synthesis between psycholinguistic theories of the mental lexicon and computational models of lexical semantics.
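To make the two measures named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes embeddings are plain lists of floats (in practice they would be extracted from BERT or ELMo) and that agreement is computed as a Pearson correlation between each annotator and the mean of the remaining annotators. The abstract does not state which correlation statistic was used, so that choice is an assumption, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed statistic)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def leave_one_out_agreement(ratings):
    """Average correlation of each annotator with the mean of all others.

    ratings[i][j] is the rating given by annotator i to item j.
    """
    scores = []
    for i, held_out in enumerate(ratings):
        others = [r for k, r in enumerate(ratings) if k != i]
        # Per-item mean rating across the remaining annotators.
        mean_others = [sum(col) / len(col) for col in zip(*others)]
        scores.append(pearson(held_out, mean_others))
    return sum(scores) / len(scores)
```

Under these assumptions, `cosine_distance` would be applied to the two contextualized vectors of the target word in a sentence pair, and the resulting distances would then be correlated with the mean human relatedness rating for each pair.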


