Knowledge-Base Enriched Word Embeddings for Biomedical Domain

by Kishlay Jha, et al.

Word embeddings have been shown to be adept at capturing the semantic and syntactic regularities of natural language text, and as a result these representations have proved useful in a wide variety of downstream content analysis tasks. Commonly, word embedding techniques derive the distributed representation of a word from its local context. However, such approaches ignore the rich explicit information present in knowledge bases. This is problematic, as it can lead to poor representations for words with insufficient local context, such as domain-specific words. The problem is more pronounced in domains such as biomedicine, where domain-specific words are relatively frequent. To this end, in this project we propose a new word-embedding-based model for the biomedical domain that jointly leverages information from available corpora and domain knowledge to generate knowledge-base-powered embeddings. Unlike existing approaches, the proposed methodology is simple yet accurately captures the precise knowledge available in domain resources. Experimental results on biomedical concept similarity and relatedness tasks validate the effectiveness of the proposed approach.






Big Data Small Data, In Domain Out-of Domain, Known Word Unknown Word: The Impact of Word Representation on Sequence Labelling Tasks

Word embeddings -- distributed word representations that can be learned ...

A Comparison of Word Embeddings for the Biomedical Natural Language Processing

Neural word embeddings have been widely used in biomedical Natural Langu...

Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts

Word embedding is a Natural Language Processing (NLP) technique that aut...

Insights into Analogy Completion from the Biomedical Domain

Analogy completion has been a popular task in recent years for evaluatin...

An Optimality Proof for the PairDiff operator for Representing Relations between Words

Representing the semantic relations that exist between two given words (...

A Simple Disaster-Related Knowledge Base for Intelligent Agents

In this paper, we describe our efforts in establishing a simple knowledg...