Vector Embedding of Wikipedia Concepts and Entities

02/12/2017
by Ehsan Sherkat, et al.

Using deep learning for machine learning tasks such as image classification and word embedding has recently gained much attention. Its popularity stems from the strong performance reported on specific Natural Language Processing (NLP) tasks in comparison with other approaches. Word embedding is the task of mapping words or phrases to low-dimensional numerical vectors. In this paper, we use deep learning to embed Wikipedia Concepts and Entities. The English version of Wikipedia contains more than five million pages, which suggests that it can cover many English Entities, Phrases, and Concepts. Each Wikipedia page is considered a concept. Some concepts correspond to entities, such as a person's name, an organization, or a place. In contrast to word embedding, Wikipedia Concept Embedding is not ambiguous: concepts that share a similar surface form but have different mentions receive different vectors. We propose several approaches and evaluate their performance on Concept Analogy and Concept Similarity tasks. The results show that the proposed approaches achieve performance comparable to, and in some cases even higher than, the state-of-the-art methods.
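
To make the two evaluation tasks concrete, the following is a minimal sketch of how Concept Similarity and Concept Analogy queries could be run over concept vectors. It is not the method of the paper: the three-dimensional vectors are invented toy values, the keys are used here as hypothetical Wikipedia page titles, and the helper names `cosine`, `most_similar`, and `analogy` are assumptions chosen for illustration. The analogy uses the common vector-offset heuristic (b - a + c), which the abstract does not confirm as the authors' procedure.

```python
import math

# Toy concept vectors keyed by hypothetical Wikipedia page titles.
# The numbers are made up for illustration and are NOT from the paper.
# Note that the two "Apple" concepts share a surface form but have
# separate vectors, which is the disambiguation property described above.
EMBEDDINGS = {
    "Apple_Inc.":    [0.9, 0.1, 0.0],
    "Apple_(fruit)": [0.0, 0.1, 0.9],
    "Microsoft":     [0.8, 0.3, 0.1],
    "Banana":        [0.1, 0.0, 0.8],
    "Steve_Jobs":    [0.9, 0.9, 0.0],
    "Bill_Gates":    [0.8, 1.0, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, exclude):
    """Title whose vector is closest (by cosine) to `target`, skipping `exclude`."""
    candidates = (t for t in EMBEDDINGS if t not in exclude)
    return max(candidates, key=lambda t: cosine(target, EMBEDDINGS[t]))


def most_similar(title):
    """Concept Similarity: the nearest other concept to `title`."""
    return nearest(EMBEDDINGS[title], exclude={title})


def analogy(a, b, c):
    """Concept Analogy: find d such that a : b :: c : d, via the offset b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])]
    return nearest(target, exclude={a, b, c})


if __name__ == "__main__":
    print(most_similar("Apple_Inc."))
    print(most_similar("Apple_(fruit)"))
    print(analogy("Steve_Jobs", "Apple_Inc.", "Bill_Gates"))
```

With these toy values, the company sense of "Apple" lands nearest to another company and the fruit sense nearest to another fruit, which a single vector for the word "apple" could not express.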


