Improving Word Vector with Prior Knowledge in Semantic Dictionary

01/27/2018
by Wei Li, et al.

Representing words in a low-dimensional vector space has been very effective in many NLP tasks. However, it does not work well for rare and unseen words. In this paper, we propose to leverage the knowledge in a semantic dictionary, in combination with some morphological information, to build an enhanced vector space. We get an improvement of 2.3% over the state-of-the-art HeidelTime system in temporal expression recognition, and obtain a large gain in other named entity recognition (NER) tasks. The semantic dictionary HowNet alone also shows promising results in computing lexical similarity.
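
The abstract does not spell out how dictionary knowledge enters the vector space, so the following is only a minimal sketch of one plausible reading, not the authors' method: when a word has no trained vector, back off to the average of the vectors of its dictionary sememes. All names and values here (`WORD_VECTORS`, `SEMEME_VECTORS`, `SEMEME_DICT`, `lookup`, `cosine`) are hypothetical toy data and helpers; none of them come from the paper or from HowNet, and the morphological component is left out.

```python
import math

# Hypothetical toy data: these entries are illustrative assumptions,
# not taken from the paper or from HowNet.
WORD_VECTORS = {
    "day": [0.9, 0.1, 0.0],
    "week": [0.8, 0.2, 0.1],
    "city": [0.0, 0.9, 0.3],
}
SEMEME_VECTORS = {
    "time": [1.0, 0.0, 0.0],
    "place": [0.0, 1.0, 0.0],
    "unit": [0.5, 0.0, 0.5],
}
# Assumed dictionary mapping rare words to their sememes.
SEMEME_DICT = {
    "fortnight": ["time", "unit"],
    "hamlet": ["place"],
}


def mean_vector(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lookup(word):
    """Return the trained vector if present, else a sememe-based back-off.

    Returns None when the word is in neither the vocabulary nor the dictionary.
    """
    if word in WORD_VECTORS:
        return WORD_VECTORS[word]
    sememes = [s for s in SEMEME_DICT.get(word, []) if s in SEMEME_VECTORS]
    if sememes:
        return mean_vector([SEMEME_VECTORS[s] for s in sememes])
    return None


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    rare = lookup("fortnight")  # unseen word, built from its sememes
    print(cosine(rare, WORD_VECTORS["day"]))
    print(cosine(rare, WORD_VECTORS["city"]))
```

In this toy setup the unseen word "fortnight" lands closer to "day" than to "city", which is the kind of behavior a dictionary-informed back-off is meant to provide for rare temporal expressions.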


02/23/2017

LTSG: Latent Topical Skip-Gram for Mutually Learning Topic Model and Vector Representations

Topic models have been widely used in discovering latent topics which ar...
12/19/2019

RIMAX: Ranking Semantic Rhymes by calculating Definition Similarity

This paper presents RIMAX, a new system for detecting semantic rhymes, u...
01/16/2020

Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence

Sememes, defined as the minimum semantic units of human languages in lin...
04/06/2020

Building a Norwegian Lexical Resource for Medical Entity Recognition

We present a large Norwegian lexical resource of categorized medical ter...
07/01/2020

Build2Vec: Building Representation in Vector Space

In this paper, we represent a methodology of a graph embeddings algorith...
03/05/2017

Random vector generation of a semantic space

We show how random vectors and random projection can be implemented in t...
09/01/2020

Document Similarity from Vector Space Densities

We propose a computationally light method for estimating similarities be...