Calculated attributes of synonym sets

03/05/2018
by Andrew Krizhanovsky, et al.

The goal of the formalization proposed in this paper is to bring together, as closely as possible, the theoretical linguistic problem of defining synonymy and the methods of computational linguistics, which generally rely on empirical, intuitive factors that lack justification. Using word vector representations, we propose a geometric approach to the mathematical modeling of a synonym set (synset). The word embeddings are based on the neural network models (Skip-gram, CBOW) developed and implemented in the word2vec program by T. Mikolov. The standard cosine similarity is used as the measure of closeness between word vectors. Several geometric characteristics of synset words are introduced: the interior of a synset, the rank of a synset word, and its centrality. These notions are intended to select the most significant synset words, i.e. the words whose senses are nearest to the sense of the synset. Some experiments with the proposed notions, based on RusVectores resources, are presented. A brief description of this work is available in the slides at https://goo.gl/K82Fei
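
As a rough illustration of the cosine similarity mentioned in the abstract, the following minimal sketch compares toy word vectors and orders the words of a small synset by how close each one is, on average, to the others. The abstract does not define the interior, rank, or centrality of a synset, so the function `rank_by_mean_similarity`, its scoring rule, and the 2-D vectors are hypothetical stand-ins invented for this example, not the paper's definitions or data.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_mean_similarity(synset):
    """Order words by mean cosine similarity to the other synset words.

    Hypothetical scoring rule for illustration only; it is NOT the
    rank or centrality defined in the paper.
    """
    scores = {}
    for word, vec in synset.items():
        others = [cosine_similarity(vec, v)
                  for w, v in synset.items() if w != word]
        scores[word] = sum(others) / len(others)
    # Highest mean similarity first
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-D vectors invented for this example (real word2vec vectors
# typically have hundreds of dimensions).
toy_synset = {
    "big": [1.0, 0.0],
    "large": [1.0, 1.0],
    "huge": [0.0, 1.0],
}

ranking = rank_by_mean_similarity(toy_synset)
print(ranking[0])
```

In this toy setup the vector for "large" lies between the other two, so it receives the highest mean similarity and comes first in the ordering. With real embeddings the vectors would be loaded from a trained model rather than written by hand.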


research
03/21/2016

Bayesian Neural Word Embedding

Recently, several works in the domain of natural language processing pre...
research
11/03/2019

Low-dimensional Semantic Space: from Text to Word Embedding

This article focuses on the study of Word Embedding, a feature-learning ...
research
05/24/2018

WSD-algorithm based on new method of vector-word contexts proximity calculation via epsilon-filtration

The problem of word sense disambiguation (WSD) is considered in the arti...
research
05/24/2018

WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration

The problem of word sense disambiguation (WSD) is considered in the arti...
research
12/11/2019

Character 3-gram Mover's Distance: An Effective Method for Detecting Near-duplicate Japanese-language Recipes

In websites that collect user-generated recipes, recipes are often poste...
research
11/19/2015

Gaussian Mixture Embeddings for Multiple Word Prototypes

Recently, word representation has been increasingly focused on for its e...
research
10/03/2016

Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding

Lexical sets contain the words filling the argument positions of a verb ...
