Implanting Rational Knowledge into Distributed Representation at Morpheme Level

11/26/2018
by   Zi Lin, et al.

Previous research has paid no attention to creating unambiguous morpheme embeddings independently of a corpus, even though such information plays an important role in expressing the exact meanings of words in paratactic languages such as Chinese. In this paper, after constructing a Chinese lexical and semantic ontology based on word formation, we propose a novel approach to implanting structured rational knowledge into distributed representations at the morpheme level, which naturally avoids heavy disambiguation in the corpus. We design a template that creates instances as pseudo-sentences solely from the pieces of morpheme knowledge recorded in the lexicon. To exploit hierarchical information and tackle data sparseness, we apply an instance proliferation technique based on similarity to expand the collection of pseudo-sentences. Distributed representations for morphemes can then be trained on these pseudo-sentences using word2vec. For evaluation, we validate the paradigmatic and syntagmatic relations of the morpheme embeddings and apply the embeddings to word similarity measurement, achieving significant improvements over the classical models of more than 5 Spearman points or 8 percentage points. These results show very promising prospects for adopting this new source of knowledge.
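The pipeline described in the abstract (template-based pseudo-sentences, then similarity-based proliferation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy lexicon, its field names, the three-token template, and the use of a shared category as the similarity criterion are hypothetical and are not taken from the paper.

```python
# Hypothetical toy lexicon: each morpheme sense ID maps to a semantic
# category and the words it forms. The schema is an illustrative assumption.
LEXICON = {
    "木1": {"category": "plant", "words": ["树木", "木材"]},
    "木2": {"category": "numb", "words": ["麻木"]},
    "林1": {"category": "plant", "words": ["树林"]},
}


def make_pseudo_sentences(lexicon):
    """Apply an assumed template: one [sense, category, word] sentence per word."""
    sentences = []
    for sense, info in lexicon.items():
        for word in info["words"]:
            sentences.append([sense, info["category"], word])
    return sentences


def proliferate(lexicon, sentences):
    """Expand the collection by swapping in senses that share a category.

    Sharing a category stands in for the similarity measure here; it is a
    simplification chosen for this sketch.
    """
    by_category = {}
    for sense, info in lexicon.items():
        by_category.setdefault(info["category"], []).append(sense)

    expanded = list(sentences)
    for sense, category, word in sentences:
        for other in by_category[category]:
            if other != sense:
                expanded.append([other, category, word])
    return expanded


base = make_pseudo_sentences(LEXICON)
expanded = proliferate(LEXICON, base)
print(len(base), len(expanded))
```

Because each pseudo-sentence is a list of tokens keyed by sense ID rather than surface character, the two senses of 木 never share a token, so no corpus-level disambiguation is needed. The resulting token lists are in the shape that common word2vec implementations accept as training sentences.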


