
Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings

by Haw-Shiuan Chang, et al.

Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable. This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (whose components are interpretable, like topics) and clusters the basis indexes in the ego network of each polysemous word. By adopting distributional inclusion vector embeddings as our basis formation model, we avoid the expensive nearest-neighbor search that plagues other graph-based methods, without sacrificing the quality of the sense clusters. Experiments on three datasets show that our proposed method produces sense clusters and embeddings that are similar to or better than those of previous state-of-the-art methods, while being significantly more efficient.
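The core idea of the abstract, clustering basis indexes in a word's ego network, can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' exact algorithm: it assumes each word has non-negative weights over a shared topic-like basis, builds an ego network whose nodes are the basis indexes the word actually uses, connects indexes whose basis vectors are similar, and treats connected components as induced senses. All names (`induce_senses`, the similarity `threshold`, the toy vectors) are hypothetical.

```python
# Hypothetical sketch of graph-based word sense induction over a
# non-negative embedding basis. Illustrative only; the paper's actual
# clustering and similarity measures may differ.

def cosine(u, v):
    """Cosine similarity between two vectors (pure-Python)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def induce_senses(word_vec, basis, threshold=0.5):
    """word_vec: non-negative weights of one word over basis topics.
    basis: list of topic vectors (the global embedding basis).
    Returns sense clusters as sorted lists of basis indexes."""
    # Ego-network nodes: the basis indexes this word actually uses.
    nodes = [i for i, w in enumerate(word_vec) if w > 0]
    # Edges: pairs of basis vectors that are similar enough.
    adj = {i: set() for i in nodes}
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            i, j = nodes[a], nodes[b]
            if cosine(basis[i], basis[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    # Connected components of the ego network = induced senses.
    senses, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:
            n = stack.pop()
            comp.append(n)
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        senses.append(sorted(comp))
    return senses

# Toy example for an ambiguous word: two correlated "finance" topics
# and one unrelated "river" topic fall into two sense clusters.
basis = [[1, 1, 0, 0], [1, 0.9, 0, 0], [0, 0, 1, 1]]
print(induce_senses([0.6, 0.3, 0.5], basis))  # [[0, 1], [2]]
```

Because the basis is global and non-negative, the expensive per-word nearest-neighbor search over the whole vocabulary is replaced by comparisons among the handful of basis indexes active for each word, which is where the claimed efficiency gain comes from.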


Sense Embedding Learning for Word Sense Induction

Conventional word sense induction (WSI) methods usually represent each i...

Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding

Modeling hypernymy, such as poodle is-a dog, is an important generalizat...

Inducing and Embedding Senses with Scaled Gumbel Softmax

Methods for learning word sense embeddings represent a single word with ...

Russian word sense induction by clustering averaged word embeddings

The paper reports our participation in the shared task on word sense ind...

Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation

Interpretability of a predictive model is a powerful feature that gains ...

RuDSI: graph-based word sense induction dataset for Russian

We present RuDSI, a new benchmark for word sense induction (WSI) in Russ...

An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces

Much recent work in bilingual lexicon induction (BLI) views word embeddi...