Knowledge-Based Biomedical Word Sense Disambiguation with Neural Concept Embeddings

10/26/2016
by A. K. M. Sabbir, et al.

Biomedical word sense disambiguation (WSD) is an important intermediate task in many natural language processing applications such as named entity recognition, syntactic parsing, and relation extraction. In this paper, we employ knowledge-based approaches that also exploit recent advances in neural word/concept embeddings to improve over the state of the art in biomedical WSD, using the MSH WSD dataset as the test set. Our methods involve weak supervision: we do not use any hand-labeled examples for WSD to build our prediction models; however, we employ an existing well-known named entity recognition and concept mapping program, MetaMap, to obtain our concept vectors. Over the MSH WSD dataset, our linear time (in terms of numbers of senses and words in the test instance) method achieves an accuracy of 92.24%, an improvement over the best known results obtained via unsupervised or knowledge-based means. A more expensive approach that we developed relies on a nearest neighbor framework and achieves an accuracy of 94.34%. Vector representations learned from unlabeled free text have recently been shown to benefit many language processing tasks, and our efforts show that biomedical WSD is no exception to this trend. For a complex and rapidly evolving domain such as biomedicine, building labeled datasets for larger sets of ambiguous terms may be impractical. Here, we show that weak supervision that leverages recent advances in representation learning can rival supervised approaches in biomedical WSD. However, external knowledge bases (here, sense inventories) play a key role in the improvements achieved.
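
The abstract does not spell out the scoring rule, so the following is only a minimal sketch of one way a linear-time, embedding-based disambiguator could look, not the authors' implementation. It assumes that context words and candidate senses share one vector space, that the context is summarized by averaging its word vectors, and that the sense with the highest cosine similarity wins. The function names, the sense labels, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def average(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def disambiguate(context_words, candidate_senses, word_vecs, concept_vecs):
    """Return the candidate sense closest to the averaged context vector.

    One pass over the context words plus one pass over the candidate senses,
    so the cost grows linearly with both (for a fixed vector dimension).
    Returns None when no context word has a vector.
    """
    known = [word_vecs[w] for w in context_words if w in word_vecs]
    if not known:
        return None
    context = average(known)
    return max(candidate_senses, key=lambda s: cosine(context, concept_vecs[s]))


# Toy, hand-made vectors (assumption: purely illustrative, not learned embeddings).
word_vecs = {
    "patient": [0.9, 0.1],
    "fever": [1.0, 0.0],
    "freezer": [0.0, 1.0],
}
# Hypothetical sense labels for the ambiguous term "cold".
concept_vecs = {
    "common_cold": [1.0, 0.1],
    "cold_temperature": [0.1, 1.0],
}

senses = ["common_cold", "cold_temperature"]
clinical = disambiguate(["patient", "fever"], senses, word_vecs, concept_vecs)
storage = disambiguate(["freezer"], senses, word_vecs, concept_vecs)
```

With these toy vectors, the clinical context is closest to the `common_cold` vector and the storage context to `cold_temperature`. In a real setting the concept vectors would come from text annotated with concept identifiers (the abstract mentions MetaMap for this step), and the sense inventory would supply the candidate senses for each ambiguous term.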

