A Hybrid Approach to Measure Semantic Relatedness in Biomedical Concepts

Objective: This work aimed to demonstrate the effectiveness of a hybrid approach based on the Sentence BERT model and the retrofitting algorithm to compute relatedness between any two biomedical concepts. Materials and Methods: We generated concept vectors by encoding concept preferred terms using ELMo, BERT, and Sentence BERT models. For ELMo, we used BioELMo and Clinical ELMo. For BERT, we used Ontology Knowledge Free (OKF) models such as PubMedBERT, BioBERT, and BioClinicalBERT, and Ontology Knowledge Injected (OKI) models such as SapBERT, CoderBERT, KbBERT, and UmlsBERT. We trained all the BERT models with a Siamese network on the SNLI and STSb datasets so that the models learn more semantic information at the phrase or sentence level and can represent multi-word concepts better. Finally, to inject ontology relationship knowledge into the concept vectors, we applied the retrofitting algorithm using concepts from various UMLS relationships. We evaluated our hybrid approach on four publicly available datasets, including the recently released EHR-RelB dataset. EHR-RelB is the largest publicly available relatedness dataset, in which 89% of the terms are multi-word, which makes it more challenging. Results: Sentence BERT models mostly outperformed the corresponding BERT models. The concept vectors generated using the Sentence BERT model based on SapBERT and retrofitted using UMLS-related concepts achieved the best results on all four datasets. Conclusions: Sentence BERT models are more effective than BERT models at computing relatedness scores in most cases. Injecting ontology knowledge into concept vectors further enhances their quality and contributes to better relatedness scores.
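As a non-authoritative illustration of the encoding and scoring step described in the abstract, the sketch below embeds two concept preferred terms with a Sentence-BERT-style encoder and scores their relatedness. The SapBERT checkpoint name, the example terms, and the use of cosine similarity as the relatedness score are assumptions for illustration; the Siamese-network fine-tuning on SNLI and STSb described above is omitted.

```python
# Minimal sketch (not the authors' code): relatedness between two concept
# preferred terms using a Sentence-BERT-style encoder. The checkpoint name,
# example terms, and cosine-similarity scoring are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

# Loading a plain transformer checkpoint here adds mean pooling by default;
# the paper's Siamese fine-tuning on SNLI/STSb is not reproduced in this sketch.
model = SentenceTransformer("cambridgeltl/SapBERT-from-PubMedBERT-fulltext")

term_a = "myocardial infarction"
term_b = "heart attack"

# Encode each preferred term into a single concept vector.
embeddings = model.encode([term_a, term_b], convert_to_tensor=True)

# Cosine similarity between the two concept vectors serves as the relatedness score.
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"relatedness({term_a!r}, {term_b!r}) = {score:.3f}")
```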
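The retrofitting step can be sketched as below, under the assumption that concept vectors are keyed by preferred term and that each concept maps to the concepts it is related to in UMLS. The function name, weights, and iteration count are illustrative and follow the standard retrofitting update (Faruqui et al., 2015), not necessarily the exact configuration used in the paper.

```python
# Minimal sketch of retrofitting concept vectors toward UMLS-related concepts.
# `vectors` maps concept terms to numpy arrays; `umls_neighbors` maps each
# concept to its related concepts (e.g., drawn from UMLS relationships).
# Weights and iteration count are illustrative assumptions.
import numpy as np

def retrofit(vectors, umls_neighbors, iterations=10, alpha=1.0, beta=1.0):
    new_vectors = {c: v.copy() for c, v in vectors.items()}
    for _ in range(iterations):
        for concept, neighbors in umls_neighbors.items():
            neighbors = [n for n in neighbors if n in new_vectors]
            if concept not in new_vectors or not neighbors:
                continue
            # Pull the vector toward its UMLS-related concepts while keeping
            # it close to the original (pre-trained) embedding.
            neighbor_sum = np.sum([new_vectors[n] for n in neighbors], axis=0)
            new_vectors[concept] = (
                alpha * vectors[concept] + beta * neighbor_sum
            ) / (alpha + beta * len(neighbors))
    return new_vectors
```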


