Uncertainty in Neural Network Word Embedding: Exploration of Threshold for Similarity

06/20/2016
by   Navid Rekabsaz, et al.

Word embedding, especially with its recent developments, promises a quantification of the similarity between terms. However, it is not clear to what extent this similarity value is genuinely meaningful and useful for subsequent tasks. We explore how well the similarity score obtained from the models indicates term relatedness. We first observe and quantify the uncertainty factor of the word embedding models with respect to the similarity value. Based on this factor, we introduce a general threshold on various dimensions which effectively filters the highly related terms. Our evaluation on four information retrieval collections supports the effectiveness of our approach: the results of the introduced threshold are significantly better than the baseline while being equal to or statistically indistinguishable from the optimal results.
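
As a rough illustration of what filtering related terms by a similarity threshold can look like, here is a minimal sketch. It assumes cosine similarity as the similarity score; the function name `related_terms`, the toy vectors, and the threshold value 0.7 are all hypothetical and are not taken from the paper, which derives its threshold from the measured uncertainty of the embedding models.

```python
import numpy as np


def related_terms(query, embeddings, threshold):
    """Return (term, cosine similarity) pairs whose similarity to `query`
    is at least `threshold`, sorted from most to least similar."""
    q = embeddings[query]
    q = q / np.linalg.norm(q)
    hits = []
    for term, vec in embeddings.items():
        if term == query:
            continue  # skip the query term itself
        sim = float(np.dot(q, vec / np.linalg.norm(vec)))
        if sim >= threshold:
            hits.append((term, sim))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
toy = {
    "car": np.array([1.0, 0.0, 0.0]),
    "automobile": np.array([0.9, 0.1, 0.0]),
    "banana": np.array([0.0, 1.0, 0.0]),
}

# 0.7 is an arbitrary example threshold, not the value proposed in the paper.
print(related_terms("car", toy, threshold=0.7))
```

With these toy vectors only "automobile" passes the threshold for the query "car". In a retrieval setting, the surviving terms would be the candidates for expanding or translating the query.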


