The Analysis about Building Cross-lingual Sememe Knowledge Base Based on Deep Clustering Network

08/10/2022
by Xiaoran Li, et al.

A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks, and we believe that by learning the smallest unit of meaning, computers can more easily understand human language. However, existing sememe KBs are built solely by manual annotation. Human annotations carry the annotators' personal biases, and word meanings are constantly updated and changed over time, so manual methods are not always practical. To address this issue, we propose an unsupervised method based on a deep clustering network (DCN) to build a sememe KB; the method can be used to build a KB for any language. We first learn distributed representations of multilingual words, use MUSE to align them in a single vector space, learn the multi-layer meaning of each word through a self-attention mechanism, and use a DCN to cluster sememe features. Finally, we complete the prediction using only a 10-dimensional sememe space in English, and find that this low-dimensional space still retains the main features of the sememes.
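The pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation, and several pieces are assumptions made for illustration: the alignment step uses orthogonal Procrustes (the closed-form mapping that MUSE applies in its refinement and supervised modes), PCA stands in for the learned encoder of a DCN, plain k-means stands in for the DCN's jointly trained clustering objective, and the self-attention step is omitted. The function names (`procrustes_align`, `pca_project`, `kmeans`), the toy random vectors, and the choice of 5 clusters are all hypothetical.

```python
import numpy as np

def procrustes_align(src, tgt):
    """Orthogonal W minimizing ||src @ W - tgt||_F; rows are paired translations."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def pca_project(x, dim=10):
    """Linear stand-in (assumption) for a learned encoder: top-`dim` principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

def kmeans(x, k, iters=50, seed=0):
    """Plain Lloyd's k-means (assumption: a DCN would train this jointly with the encoder)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Toy data (assumption): the second language is a hidden rotation of the first.
rng = np.random.default_rng(0)
en = rng.normal(size=(200, 50))
q, _ = np.linalg.qr(rng.normal(size=(50, 50)))
other = en @ q.T

# 1) Align the second language into the English space.
W = procrustes_align(other, en)
aligned = other @ W

# 2) Compress the shared space to 10 dimensions, then 3) cluster it.
shared = np.vstack([en, aligned])
codes = pca_project(shared, dim=10)
labels, centroids = kmeans(codes, k=5)
```

In this sketch each cluster index plays the role of a candidate sememe feature group in the 10-dimensional space. A real DCN would replace `pca_project` with a trained autoencoder and update the encoder and centroids together.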

research
12/04/2019

Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets

A sememe is defined as the minimum semantic unit of human languages. Sem...
research
03/14/2022

Sememe Prediction for BabelNet Synsets using Multilingual and Multimodal Information

In linguistics, a sememe is defined as the minimum semantic unit of lang...
research
06/17/2018

Incorporating Chinese Characters of Words for Lexical Sememe Prediction

Sememes are minimum semantic units of concepts in human languages, such ...
research
07/21/2017

Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

Existing approaches to automatic VerbNet-style verb classification are h...
research
04/17/2021

AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples

Capturing word meaning in context and distinguishing between corresponde...
research
08/16/2018

Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions

Huge numbers of new words emerge every day, leading to a great need for ...
research
05/26/2021

Automatic Construction of Sememe Knowledge Bases via Dictionaries

A sememe is defined as the minimum semantic unit in linguistics. Sememe ...
