BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology

09/20/2021
by Luke Gessler, et al.

An important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of uncommon senses. Rather than build a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ranking performance for words and senses in different frequency bands. In an evaluation on two English sense-annotated corpora, we find that several popular CWE models all outperform a random baseline even for proportionally rare senses, without explicit sense supervision. However, performance varies considerably even among models with similar architectures and pretraining regimes, with especially large differences for rare word senses, revealing that CWE models are not all created equal when it comes to approximating word senses in their native representations.
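The retrieval setup described in the abstract is straightforward to sketch: embed a query occurrence of a word with a CWE model, then rank other sense-annotated occurrences of that word by cosine similarity to the query, so that same-sense occurrences should appear near the top. The snippet below is a minimal illustration of this idea, assuming a Hugging Face BERT checkpoint, final-layer mean pooling over the target word's subword pieces, and toy sense-annotated sentences; none of these choices is taken from the paper's exact configuration.

```python
# Minimal sketch of query-by-example nearest-neighbor retrieval over
# contextualized word embeddings. Model name, pooling choice, and the
# toy data are illustrative assumptions, not the authors' exact setup.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style CWE model could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()


def embed_target(words, target_idx):
    """Return a contextualized vector for words[target_idx],
    mean-pooling its subword pieces from the final hidden layer."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    piece_ids = [i for i, w in enumerate(enc.word_ids()) if w == target_idx]
    return hidden[piece_ids].mean(dim=0)


# Toy sense-annotated occurrences: (tokenized sentence, target index, sense tag).
occurrences = [
    ("He sat on the bank of the river".split(), 4, "bank.n.river"),
    ("The bank raised its interest rates".split(), 1, "bank.n.finance"),
    ("They fished from the grassy bank".split(), 5, "bank.n.river"),
    ("She deposited cash at the bank".split(), 5, "bank.n.finance"),
]

query_words, query_idx, query_sense = occurrences[0]
query_vec = embed_target(query_words, query_idx)

# Rank the remaining occurrences by cosine similarity to the query;
# a good CWE model should rank same-sense occurrences highest.
ranked = sorted(
    occurrences[1:],
    key=lambda occ: -torch.cosine_similarity(
        query_vec, embed_target(occ[0], occ[1]), dim=0
    ).item(),
)
for words, idx, sense in ranked:
    print(sense, " ".join(words))
```

In the paper's evaluation, ranking quality is then scored separately for words and senses in different frequency bands; this sketch only prints a similarity-ordered list for a single query.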


Related research

Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders (05/06/2020)
A major obstacle in Word Sense Disambiguation (WSD) is that word senses ...

Using BERT for Word Sense Disambiguation (09/18/2019)
Word Sense Disambiguation (WSD), which aims to identify the correct sens...

To lemmatize or not to lemmatize: how word normalisation affects ELMo performance in word sense disambiguation (09/06/2019)
We critically evaluate the widespread assumption that deep learning NLP ...

Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation (05/14/2019)
In this article, we tackle the issue of the limited quantity of manually...

Probabilistic FastText for Multi-Sense Word Embeddings (06/07/2018)
We introduce Probabilistic FastText, a new model for word embeddings tha...

Merchandise Recommendation for Retail Events with Word Embedding Weighted Tf-idf and Dynamic Query Expansion (08/18/2022)
To recommend relevant merchandises for seasonal retail events, we rely o...

Improving the Coverage and the Generalization Ability of Neural Word Sense Disambiguation through Hypernymy and Hyponymy Relationships (11/02/2018)
In Word Sense Disambiguation (WSD), the predominant approach generally i...
