Word sense induction using word embeddings and community detection in complex networks

03/22/2018
by Edilson A. Corrêa Jr. et al.

Word Sense Induction (WSI) is the task of automatically inducing word senses from corpora. It was first proposed to overcome the limitations of the manually annotated corpora required by word sense disambiguation systems. Although several approaches to inducing word senses have been proposed, existing systems remain limited in that they rely on structured, domain-specific knowledge sources. In this paper, we devise a method that leverages recent findings in word embedding research to generate context embeddings, i.e., embeddings that capture the semantic context of a word occurrence. To induce senses, we model the set of instances of ambiguous words as a complex network in which two instances (nodes) are connected if their context embeddings are similar. Using well-established community detection methods to cluster the nodes of the resulting network, we find that the proposed method yields excellent performance on the WSI task. Our method outperforms competing algorithms and baselines in a completely unsupervised manner, without the need for any additional structured knowledge source.
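
The sketch below is a minimal illustration of this pipeline, not the authors' implementation. It assumes pre-computed context embeddings (e.g., averages of the embeddings of the words surrounding each occurrence), connects occurrences whose cosine similarity exceeds an arbitrary threshold, and uses networkx's greedy modularity community detection in place of whichever detection methods the paper evaluates. The function name induce_senses and its parameters are hypothetical and chosen only for illustration.

```python
# Minimal sketch: build an occurrence graph from context embeddings and
# treat each detected community as one induced word sense.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def induce_senses(context_embeddings, similarity_threshold=0.7):
    """Cluster occurrences of an ambiguous word into induced senses.

    context_embeddings: array of shape (n_occurrences, dim), one vector per
    occurrence of the ambiguous word (an assumption of this sketch).
    """
    vectors = np.asarray(context_embeddings, dtype=float)
    # Normalize so the dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    vectors = vectors / np.clip(norms, 1e-12, None)
    similarities = vectors @ vectors.T

    # Build the network: nodes are occurrences, edges connect occurrences
    # whose context embeddings are sufficiently similar.
    graph = nx.Graph()
    graph.add_nodes_from(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if similarities[i, j] >= similarity_threshold:
                graph.add_edge(i, j, weight=float(similarities[i, j]))

    # Each detected community is returned as one induced sense.
    communities = greedy_modularity_communities(graph, weight="weight")
    return [sorted(community) for community in communities]


if __name__ == "__main__":
    # Toy usage: two well-separated groups of contexts should yield two senses.
    rng = np.random.default_rng(0)
    group_a = rng.normal(loc=1.0, scale=0.05, size=(5, 50))
    group_b = rng.normal(loc=-1.0, scale=0.05, size=(5, 50))
    print(induce_senses(np.vstack([group_a, group_b])))
```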

Related research:

Making Sense of Word Embeddings (08/10/2017)
We present a simple yet effective approach for learning word sense embed...

Large Scale Substitution-based Word Sense Induction (10/14/2021)
We present a word-sense induction method based on pre-trained masked lan...

Word Sense Disambiguation for 158 Languages using Word Embeddings Only (03/14/2020)
Disambiguation of word senses in context is easy for humans, but is a ma...

Russian word sense induction by clustering averaged word embeddings (05/06/2018)
The paper reports our participation in the shared task on word sense ind...

Adapting predominant and novel sense discovery algorithms for identifying corpus-specific sense differences (02/01/2018)
Word senses are not static and may have temporal, spatial or corpus-spec...

Complex networks based word embeddings (10/03/2019)
Most of the time, the first step to learn word embeddings is to build a ...

Semantic Frame Induction using Masked Word Embeddings and Two-Step Clustering (05/27/2021)
Recent studies on semantic frame induction show that relatively high per...