Word Sense Disambiguation using Diffusion Kernel PCA

07/21/2019
by Bilge Sipal, et al.

One of the major problems in natural language processing (NLP) is word sense disambiguation (WSD): the task of computationally identifying the correct sense of a polysemous word from its context. Resolving WSD improves the accuracy of many NLP algorithms, such as text classification and machine translation. In this paper, we introduce a new supervised algorithm for WSD, called Diffusion Kernel PCA (DKPCA), which is based on Kernel PCA and the Semantic Diffusion Kernel. DKPCA captures the semantic similarities between terms, and it is based on PCA. These properties enable us to perform feature extraction and dimension reduction guided by semantic similarities within the algorithm itself. Our empirical results on SensEval data demonstrate that DKPCA achieves accuracy higher than or very close to that of SVM and KPCA with various well-known kernels when the ratio of labeled data is small. Given that labeled data is scarce while large quantities of unlabeled textual data are easily accessible, these are highly encouraging first results that motivate developing DKPCA further.
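The abstract does not spell out how the diffusion kernel and Kernel PCA are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an exponential-style diffusion kernel of the form K0 exp(λ K0) built from a bag-of-words context matrix, followed by standard kernel PCA on the centered kernel. The function names `diffusion_kernel` and `kernel_pca`, the parameter `lam`, and the toy count matrix are all hypothetical and chosen for illustration.

```python
import numpy as np

def diffusion_kernel(X, lam=0.1):
    """Assumed diffusion-style kernel K0 @ expm(lam * K0), with K0 = X X^T.

    X is a (contexts x terms) count matrix. This form is an assumption;
    the abstract does not give the exact kernel definition.
    """
    K0 = X @ X.T                       # linear (bag-of-words) kernel
    w, V = np.linalg.eigh(K0)          # K0 is symmetric PSD
    w = np.clip(w, 0.0, None)          # guard against tiny negative eigenvalues
    # K0 and expm(lam * K0) share eigenvectors, so apply the map per eigenvalue.
    return (V * (w * np.exp(lam * w))) @ V.T

def kernel_pca(K, n_components=2):
    """Standard kernel PCA: center K, then project onto the top eigenvectors."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    Kc = J @ K @ J
    w, V = np.linalg.eigh(Kc)
    idx = np.argsort(w)[::-1][:n_components]
    w, V = w[idx], V[:, idx]
    # Coordinates of the training points along each principal component.
    return V * np.sqrt(np.clip(w, 0.0, None))

# Hypothetical toy data: 4 contexts of an ambiguous word over 5 context terms.
X = np.array([
    [2, 1, 0, 0, 1],
    [1, 2, 0, 0, 0],
    [0, 0, 2, 1, 0],
    [0, 0, 1, 2, 1],
], dtype=float)

K = diffusion_kernel(X, lam=0.05)
Z = kernel_pca(K, n_components=2)
```

The rows of `Z` are low-dimensional features for each context. In a supervised setting they could be passed to any classifier trained on the sense labels; which classifier the paper uses is not stated in the abstract.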

research
08/28/2018

KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation

We propose KDSL, a new word sense disambiguation (WSD) framework that ut...
research
02/27/2017

A Knowledge-Based Approach to Word Sense Disambiguation by distributional selection and semantic features

Word sense disambiguation improves many Natural Language Processing (NLP...
research
12/18/2020

Upper and Lower Bounds on the Performance of Kernel PCA

Principal Component Analysis (PCA) is a popular method for dimension red...
research
02/07/2018

Unsupervised word sense disambiguation in dynamic semantic spaces

In this paper, we are mainly concerned with the ability to quickly and a...
research
02/27/2017

Approches d'analyse distributionnelle pour améliorer la désambiguïsation sémantique

Word sense disambiguation (WSD) improves many Natural Language Processin...
research
05/17/2016

Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing

We show Correspondence Analysis (CA) is equivalent to defining Gini-inde...
research
11/10/2017

Efficient Representation for Natural Language Processing via Kernelized Hashcodes

Kernel similarity functions have been successfully applied in classifica...
