Incorporating Chinese Characters of Words for Lexical Sememe Prediction

06/17/2018
by Huiming Jin, et al.

Sememes are the minimum semantic units of concepts in human languages, such that each word sense is composed of one or more sememes. Linguists usually annotate words with their sememes manually, and the resulting annotations form linguistic common-sense knowledge bases that are widely used in various NLP tasks. Recently, the lexical sememe prediction task has been introduced. It consists of automatically recommending sememes for words, which is expected to improve annotation efficiency and consistency. However, existing methods for lexical sememe prediction typically rely on the external context of words to represent their meaning, an approach that usually fails on low-frequency and out-of-vocabulary words. To address this issue for Chinese, we propose a novel framework that takes advantage of both the internal character information and the external context information of words. We experiment on HowNet, a Chinese sememe knowledge base, and demonstrate that our framework outperforms state-of-the-art baselines by a large margin and maintains robust performance even for low-frequency words.


