Automatic Biomedical Term Clustering by Learning Fine-grained Term Representations

04/01/2022
by Sihang Zeng, et al.

Term clustering is important in biomedical knowledge graph construction, and similarities between term embeddings are a useful signal for it. State-of-the-art term embeddings use pretrained language models to encode terms, and use synonym and relation knowledge from knowledge graphs to guide contrastive learning. These methods produce close embeddings for terms that belong to the same concept. However, our probing experiments show that the embeddings are not sensitive to minor textual differences, which leads to failures in biomedical term clustering. To alleviate this problem, we adjust the sampling strategy used when pretraining term embeddings: we provide dynamic hard positive and negative samples during contrastive learning so that the model learns fine-grained representations, which result in better biomedical term clustering. We name the proposed method CODER++. It has been applied to clustering biomedical concepts in BIOS, a newly released biomedical knowledge graph.
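The idea of hard positive and negative sampling can be illustrated with a minimal sketch. This is not the CODER++ implementation, whose details the abstract does not give. It assumes that each term already has an embedding vector and a concept label, that cosine similarity is the similarity measure, that a hard positive is the least similar term from the same concept, and that a hard negative is the most similar term from a different concept. The function name `mine_hard_samples` and the toy data are hypothetical. To make the sampling dynamic in this sketch, one would re-run the mining with fresh embeddings as the encoder is updated.

```python
import numpy as np

def mine_hard_samples(embeddings, concept_ids):
    """Pick one hard positive and one hard negative index per term.

    Assumption (not from the source): hard positive = same-concept term with
    the lowest cosine similarity; hard negative = different-concept term with
    the highest cosine similarity. Returns -1 where no candidate exists.
    Embeddings are assumed to be non-zero vectors.
    """
    emb = np.asarray(embeddings, dtype=float)
    ids = np.asarray(concept_ids)

    # Normalize rows so that the dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T

    same = ids[:, None] == ids[None, :]       # same-concept mask
    not_self = ~np.eye(len(ids), dtype=bool)  # exclude the anchor itself
    pos_mask = same & not_self
    neg_mask = ~same

    # Mask out non-candidates, then take the extreme similarity in each row.
    hard_pos = np.where(pos_mask, sim, np.inf).argmin(axis=1)
    hard_neg = np.where(neg_mask, sim, -np.inf).argmax(axis=1)

    # Mark anchors that have no valid positive or negative with -1.
    hard_pos = np.where(pos_mask.any(axis=1), hard_pos, -1)
    hard_neg = np.where(neg_mask.any(axis=1), hard_neg, -1)
    return hard_pos, hard_neg

# Toy data: five 2-d "term embeddings" in two concepts (hypothetical values).
toy_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.96, 0.28], [-1.0, 0.0]]
toy_concepts = [0, 0, 0, 1, 1]
toy_pos, toy_neg = mine_hard_samples(toy_embeddings, toy_concepts)
```

In the toy data, the anchor at index 0 is paired with index 2 as its hard positive (same concept, orthogonal vector) and index 3 as its hard negative (different concept, nearly parallel vector). The selected triples could then feed any contrastive or triplet-style loss.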


research
07/01/2023

Hierarchical Pretraining for Biomedical Term Embeddings

Electronic health records (EHR) contain narrative notes that provide ext...
research
05/09/2018

Adversarial Contrastive Estimation

Learning by contrasting positive and negative samples is a general strat...
research
11/05/2020

CODER: Knowledge infused cross-lingual medical term embedding for term normalization

We propose a novel medical term embedding method named CODER, which stan...
research
12/22/2018

TaxoGen: Unsupervised Topic Taxonomy Construction by Adaptive Term Embedding and Clustering

Taxonomy construction is not only a fundamental task for semantic analys...
research
10/05/2020

Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation

Understanding the relationships between biomedical terms like viruses, d...
research
03/18/2022

BIOS: An Algorithmically Generated Biomedical Knowledge Graph

Biomedical knowledge graphs (BioMedKGs) are essential infrastructures fo...
