Knowledge-based Word Sense Disambiguation using Topic Models

01/05/2018
by Devendra Singh Chaplot et al.

Word Sense Disambiguation (WSD) is an open problem in Natural Language Processing that is particularly challenging and useful in the unsupervised setting, where all the words in a given text must be disambiguated without using any labeled data. Typically, WSD systems use the sentence or a small window of words around the target word as the context for disambiguation because their computational complexity scales exponentially with the size of the context. In this paper, we leverage the formalism of topic models to design a WSD system that scales linearly with the number of words in the context. As a result, our system is able to use the whole document as the context for a word to be disambiguated. The proposed method is a variant of Latent Dirichlet Allocation in which the topic proportions for a document are replaced by synset proportions. We further exploit the information in WordNet by assigning a non-uniform prior to the synset distribution over words and a logistic-normal prior to the document distribution over synsets. We evaluate the proposed method on the Senseval-2, Senseval-3, SemEval-2007, SemEval-2013 and SemEval-2015 English All-Words WSD datasets and show that it outperforms the state-of-the-art unsupervised knowledge-based WSD system by a significant margin.
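As a rough sketch, one way to read the generative story described in the abstract is the following (our own reconstruction; the symbols eta_d, theta_d, phi_s, beta_s, z_{d,n}, w_{d,n} and the exact choice of distributions are assumptions based on the abstract, not notation taken from the paper):

% Reconstructed sketch of the generative process; symbols are our own notation, not the paper's.
\begin{align*}
\eta_d &\sim \mathcal{N}(\mu, \Sigma), \quad \theta_d = \operatorname{softmax}(\eta_d)
  && \text{(logistic-normal prior: per-document synset proportions)} \\
\phi_s &\sim \operatorname{Dirichlet}(\beta_s)
  && \text{(per-synset word distribution; } \beta_s \text{ non-uniform, derived from WordNet)} \\
z_{d,n} &\sim \operatorname{Categorical}(\theta_d)
  && \text{(latent synset for the } n\text{-th word of document } d\text{)} \\
w_{d,n} &\sim \operatorname{Categorical}(\phi_{z_{d,n}})
  && \text{(observed word)}
\end{align*}

Under a model of this form, the predicted sense of each word is its inferred synset assignment z_{d,n}, and a sweep of inference only needs to visit each word's candidate synsets once, which is consistent with the linear scaling in the number of context words that the abstract claims.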

Related research:
- Word Sense Disambiguation using Knowledge-based Word Similarity (11/11/2019)
- ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing (07/25/2017)
- A Novel Word Sense Disambiguation Approach Using WordNet Knowledge Graph (01/08/2021)
- Topic Discovery through Data Dependent and Random Projections (03/15/2013)
- Document Informed Neural Autoregressive Topic Models with Distributional Prior (09/15/2018)
- Unsupervised Multimodal Word Discovery based on Double Articulation Analysis with Co-occurrence cues (01/18/2022)
- Document Informed Neural Autoregressive Topic Models (08/11/2018)
