Homonym Identification using BERT – Using a Clustering Approach

01/07/2021
by Rohan Saha, et al.

Homonym identification is important for word sense disambiguation (WSD) tasks that require coarse-grained partitions of senses. The goal of this project is to determine whether contextual information is sufficient for identifying a homonymous word. To capture context, BERT embeddings are used instead of Word2Vec, which conflates all senses of a word into a single vector. The embeddings are obtained from the SemCor corpus, and various clustering algorithms are applied to them. Finally, the embeddings are visualized in a lower-dimensional space to assess the feasibility of the clustering process.
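The clustering step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-dimensional vectors are hand-made stand-ins for contextual embeddings of the word "bank" (real BERT embeddings have hundreds of dimensions and would be computed from SemCor sentences), the example sentences are invented, and k-means with two clusters is only one possible choice, since the abstract does not say which clustering algorithms were used.

```python
import math

def kmeans(vectors, k, iters=100):
    """Plain k-means; returns one cluster label per input vector."""
    # Deterministic farthest-point initialisation: start from the first
    # vector, then add the vector farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors, key=lambda v: min(math.dist(v, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical occurrences of "bank" with toy stand-in embeddings.
occurrences = [
    ("deposit money at the bank", [0.90, 0.10, 0.00]),
    ("the bank approved the loan", [0.80, 0.20, 0.10]),
    ("the bank raised interest rates", [0.85, 0.15, 0.05]),
    ("fishing on the river bank", [0.10, 0.90, 0.20]),
    ("the bank of the stream", [0.00, 0.80, 0.30]),
    ("a grassy bank by the water", [0.05, 0.85, 0.25]),
]

vectors = [vec for _, vec in occurrences]
labels = kmeans(vectors, k=2)

for (sentence, _), label in zip(occurrences, labels):
    print(label, sentence)
```

If occurrences of a word separate cleanly into clusters like this, that is weak evidence the contexts reflect distinct senses; whether real BERT embeddings behave this way is the question the paper investigates.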


