BERT-based Ranking for Biomedical Entity Normalization

08/09/2019
by Zongcheng Ji, et al.

Developing high-performance entity normalization algorithms that can alleviate the term variation problem is of great interest to the biomedical community. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings. Bidirectional Encoder Representations from Transformers (BERT), BERT for Biomedical Text Mining (BioBERT), and BERT for Clinical Text Mining (ClinicalBERT) were recently introduced to pre-train contextualized word representation models using bidirectional Transformers, advancing the state-of-the-art for many natural language processing tasks. In this study, we proposed an entity normalization architecture by fine-tuning the pre-trained BERT, BioBERT, and ClinicalBERT models, and conducted extensive experiments to evaluate the effectiveness of these pre-trained models for biomedical entity normalization on three different types of datasets. Our experimental results show that the best fine-tuned models consistently outperformed previous methods and advanced the state-of-the-art for biomedical entity normalization, with up to a 1.17% increase in accuracy.
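As the title suggests, the architecture treats normalization as a ranking problem: a mention is paired with each candidate concept name from the target terminology, and the fine-tuned model scores the pairs. The sketch below shows one plausible version of this pairwise setup using the HuggingFace transformers library; the checkpoint name, the rank_candidates helper, and the binary match head are illustrative assumptions, not the authors' released code, and in practice the model would first be fine-tuned on labeled mention-concept pairs.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Illustrative checkpoint; a BioBERT or ClinicalBERT checkpoint would be
# substituted here the same way.
MODEL_NAME = "bert-base-uncased"

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
# Binary head over a (mention, candidate name) sentence pair: match / no match.
# This head is assumed to have been fine-tuned on labeled pairs beforehand.
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def rank_candidates(mention: str, candidate_names: list[str]):
    """Score every (mention, candidate) pair in one batch and return the
    candidates sorted by predicted match probability, highest first."""
    inputs = tokenizer(
        [mention] * len(candidate_names),  # sentence A: the raw mention
        candidate_names,                   # sentence B: a candidate concept name
        padding=True,
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits            # shape: (num_candidates, 2)
    match_prob = logits.softmax(dim=-1)[:, 1]      # P(match) for each candidate
    order = match_prob.argsort(descending=True)
    return [(candidate_names[i], match_prob[i].item()) for i in order]

# Normalize a disease mention against a handful of candidate concept names.
print(rank_candidates("heart attack",
                      ["myocardial infarction", "cardiac arrest", "angina pectoris"]))
```

Scoring all candidates in a single batch keeps the ranking step to one forward pass; in a full pipeline, a cheaper retrieval stage (e.g., string matching) would typically shortlist the candidates first.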



Related Research

01/25/2019
BioBERT: pre-trained biomedical language representation model for biomedical text mining
Biomedical text mining has become more important than ever as the number...

01/22/2021
Drug and Disease Interpretation Learning with Biomedical Entity Representation Transformer
Concept normalization in free-form texts is a crucial step in every text...

05/01/2020
Biomedical Entity Representations with Synonym Marginalization
Biomedical named entities often play important roles in many biomedical...

05/14/2020
A pre-training technique to localize medical BERT and enhance BioBERT
Bidirectional Encoder Representations from Transformers (BERT) models fo...

09/10/2021
Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT
Infusing factual knowledge into pre-trained models is fundamental for ma...

12/02/2021
Unsupervised Law Article Mining based on Deep Pre-Trained Language Representation Models with Application to the Italian Civil Code
Modeling law search and retrieval as prediction problems has recently em...

12/22/2020
Improved Biomedical Word Embeddings in the Transformer Era
Biomedical word embeddings are usually pre-trained on free text corpora...