Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness

09/21/2017
by Zhiguo Yu, et al.

Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting of concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date. They also demonstrate a clear improvement in performance on this reference standard for retrofitted vector representations, as compared to those without retrofitting.
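
The retrofitting idea described above can be illustrated with a minimal sketch. The abstract does not give the update rule, so this example assumes the widely used iterative scheme in which each vector is repeatedly replaced by a weighted average of its original distributional vector and the current vectors of its linked neighbours. The concept identifiers, the toy two-dimensional vectors, the `links` dictionary, and the `alpha` weight are all hypothetical placeholders, not values from the paper or from the UMLS Metathesaurus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrofit(vectors, links, iterations=10, alpha=1.0):
    """Pull the vectors of linked concepts toward each other.

    Assumed update rule (not confirmed by the abstract): each concept's
    new vector is a weighted average of its original vector (weight
    alpha) and the mean of its neighbours' current vectors (weight 1).
    """
    original = {c: list(v) for c, v in vectors.items()}
    updated = {c: list(v) for c, v in vectors.items()}
    for _ in range(iterations):
        for concept, neighbours in links.items():
            # Skip concepts or neighbours that have no vector.
            known = [n for n in neighbours if n in updated]
            if concept not in updated or not known:
                continue
            beta = 1.0 / len(known)  # neighbour weights sum to 1
            new_vec = []
            for d in range(len(original[concept])):
                neighbour_sum = sum(beta * updated[n][d] for n in known)
                new_vec.append(
                    (alpha * original[concept][d] + neighbour_sum) / (alpha + 1.0)
                )
            updated[concept] = new_vec
    return updated


# Hypothetical toy data: two linked concepts and one unlinked concept.
vectors = {
    "concept_a": [1.0, 0.0],
    "concept_b": [0.0, 1.0],
    "concept_c": [1.0, 1.0],
}
links = {"concept_a": ["concept_b"], "concept_b": ["concept_a"]}

before = cosine(vectors["concept_a"], vectors["concept_b"])
retrofitted = retrofit(vectors, links)
after = cosine(retrofitted["concept_a"], retrofitted["concept_b"])
print(f"cosine before: {before:.3f}, after: {after:.3f}")
```

In this sketch the similarity between the two linked concepts rises after retrofitting, while the unlinked concept keeps its original vector. The input dictionary is copied rather than modified in place so that the original and retrofitted vectors can be compared side by side.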

research
10/21/2022

BioLORD: Learning Ontological Representations from Definitions (for Biomedical Concepts and their Textual Descriptions)

This work introduces BioLORD, a new pre-training strategy for producing ...
research
09/02/2016

Improving Correlation with Human Judgments by Integrating Semantic Similarity with Second-Order Vectors

Vector space methods that measure semantic similarity and relatedness of...
research
07/19/2016

A Novel Information Theoretic Framework for Finding Semantic Similarity in WordNet

Information content (IC) based measures for finding semantic similarity ...
research
10/15/2019

Context Matters: Recovering Human Semantic Structure from Machine Learning Analysis of Large-Scale Text Corpora

Understanding how human semantic knowledge is organized and how people u...
research
08/19/2016

Using Distributed Representations to Disambiguate Biomedical and Clinical Concepts

In this paper, we report a knowledge-based method for Word Sense Disambi...
research
02/08/2023

A Parametric Similarity Method: Comparative Experiments based on Semantically Annotated Large Datasets

We present the parametric method SemSimp aimed at measuring semantic sim...
