Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach

by Oleksandr Palagin, et al.

We present a new technique for distributional semantic modeling that uses a neural network-based approach to learn distributed term representations (term embeddings), yielding term vector space models. The technique is inspired by a recent ontology-related approach to identifying terms (term extraction) and the relations between them (relation extraction), called semantic pre-processing technology (SPT), which draws on different types of contextual knowledge such as syntactic, terminological, and semantic knowledge. Our method relies on automatic term extraction from natural language texts and the subsequent formation of problem-oriented or application-oriented (and deeply annotated) text corpora in which the fundamental entity is the term, whether non-compositional or compositional. This lets us move from distributed word representations (word embeddings) to distributed term representations (term embeddings). The transition makes it possible to generate more accurate semantic maps of different subject domains and of the relations between input terms, which is useful for exploring clusters and oppositions or for testing hypotheses about them. A semantic map can be represented as a graph using Vec2graph, a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs. The Vec2graph library coupled with term embeddings will not only improve accuracy in solving standard NLP tasks, but also update the conventional concept of automated ontology development. The main practical result of our work is a development kit (a set of toolkits exposed as web service APIs and a web application) that provides all the routines needed for basic linguistic pre-processing and semantic pre-processing of natural language texts in Ukrainian, for the subsequent training of term vector space models.
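The shift from word-level to term-level units can be illustrated with a minimal Python sketch. It is not the authors' SPT pipeline and it does not call Vec2graph. The term list, function names, window size, and threshold are all hypothetical. Simple co-occurrence counts stand in for the neural embeddings described in the abstract. The sketch merges known multiword terms into single tokens, builds a count-based vector for each token, and lists the term pairs whose cosine similarity passes a threshold, which is the kind of edge list a graph visualizer could render.

```python
from collections import Counter
from math import sqrt

# Hypothetical term list; the abstract obtains terms by automatic term extraction.
TERMS = [("neural", "network"), ("vector", "space", "model")]


def merge_terms(tokens, terms):
    """Replace known multiword terms with single tokens, longest match first."""
    ordered = sorted(terms, key=len, reverse=True)
    merged, i = [], 0
    while i < len(tokens):
        for term in ordered:
            if tuple(tokens[i:i + len(term)]) == term:
                merged.append("_".join(term))
                i += len(term)
                break
        else:
            # No term starts at this position, so keep the plain word.
            merged.append(tokens[i])
            i += 1
    return merged


def cooccurrence_vectors(sentences, window=2):
    """Count context tokens within `window` positions of each token.

    Assumption: these sparse counts are a stand-in for trained embeddings.
    """
    vectors = {}
    for sent in sentences:
        for i, tok in enumerate(sent):
            context = sent[max(0, i - window):i] + sent[i + 1:i + 1 + window]
            vectors.setdefault(tok, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighbour_graph(vectors, threshold=0.5):
    """Return (term_a, term_b, similarity) edges at or above the threshold."""
    names = sorted(vectors)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                edges.append((a, b, sim))
    return edges


if __name__ == "__main__":
    raw = [
        "a neural network learns a vector space model",
        "a vector space model represents each term",
        "a neural network represents each term",
    ]
    corpus = [merge_terms(s.split(), TERMS) for s in raw]
    for edge in neighbour_graph(cooccurrence_vectors(corpus)):
        print(edge)
```

Merging terms before counting means that a compositional term such as "vector space model" gets one vector of its own rather than three unrelated word vectors, which is the core idea behind replacing word embeddings with term embeddings.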




