Detecting Reliable Novel Word Senses: A Network-Centric Approach

12/14/2018
by Abhik Jana, et al.

In this era of Big Data, the rapid exchange of information on the web leads words to take on newer meanings, causing linguistic shift. With the recent availability of large amounts of digitized text, automated analysis of the evolution of language has become possible. Our study focuses on improving the detection of new word senses. This paper presents a proposal based on network features to improve the precision of new word sense detection. For a candidate word where a new sense (birth) has been detected by comparing the sense clusters induced at two different time points, we further compare the network properties of the subgraphs induced from the novel sense cluster across these two time points. Using the mean fractional change in edge density, structural similarity and average path length as features in an SVM classifier, manual evaluation gives precision values of 0.86 and 0.74 for the task of new sense detection when tested on 2 distinct time-point pairs, compared with precision values in the range of 0.23-0.32 when the proposed scheme is not used. The outlined method can therefore be used as a new post-hoc step to improve the precision of novel word sense detection in a robust and reliable way where the underlying framework uses a graph structure. Another important observation is that even though our proposal is a post-hoc step, it can also be used in isolation, where it still achieves a precision of 0.54-0.62. Finally, we show that our method is able to detect the well-known historical shifts in 80
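
To make the feature idea in the abstract concrete, here is a minimal sketch of how the fractional change in edge density and in average path length could be computed for the subgraph induced by one sense cluster at two time points. It is not the authors' implementation: the abstract does not give the exact definitions, so standard graph-theoretic ones are assumed, and the function names (`induced_subgraph`, `network_features`, and so on) are hypothetical. Structural similarity is left out because the abstract does not define it, and the SVM step is not shown.

```python
from collections import deque


def induced_subgraph(edges, nodes):
    """Adjacency sets of the undirected subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_density(adj):
    """Edges present divided by edges possible (assumed definition)."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(neigh) for neigh in adj.values()) / 2
    return m / (n * (n - 1) / 2)


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def fractional_change(old, new):
    """(new - old) / old; returns None when the old value is zero."""
    if old == 0:
        return None
    return (new - old) / old


def network_features(edges_old, edges_new, cluster):
    """Fractional changes of both properties for one sense cluster."""
    g_old = induced_subgraph(edges_old, cluster)
    g_new = induced_subgraph(edges_new, cluster)
    return {
        "edge_density": fractional_change(edge_density(g_old), edge_density(g_new)),
        "avg_path_length": fractional_change(
            average_path_length(g_old), average_path_length(g_new)
        ),
    }


# Toy data (invented for illustration): the cluster words form a sparse
# chain at the earlier time point and a fully connected group later on.
cluster = ["a", "b", "c", "d"]
edges_old = [("a", "b"), ("b", "c"), ("c", "d")]
edges_new = edges_old + [("a", "c"), ("a", "d"), ("b", "d")]
features = network_features(edges_old, edges_new, cluster)
print(features)
```

In this toy case the cluster becomes denser and its paths shorter over time, which is the kind of change such features are meant to capture. A feature vector of this sort, one per candidate word, is what a classifier like an SVM would then consume.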


research
05/17/2014

That's sick dude!: Automatic identification of word sense change across different timescales

In this paper, we propose an unsupervised method to identify noun sense ...
research
01/06/2017

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation

Previous researches have shown that learning multiple representations fo...
research
09/26/2017

Polysemy Detection in Distributed Representation of Word Sense

In this paper, we propose a statistical test to determine whether a give...
research
11/27/2021

Language models in word sense disambiguation for Polish

In the paper, we test two different approaches to the unsupervised word ...
research
03/30/2016

Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders

We present an approach to learning multi-sense word embeddings relying b...
research
03/18/2022

SCoT: Sense Clustering over Time: a tool for the analysis of lexical change

We present Sense Clustering over Time (SCoT), a novel network-based tool...
research
07/25/2017

ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing

In this paper, we present a novel unsupervised algorithm for word sense ...
