A multi-perspective combined recall and rank framework for Chinese procedure terminology normalization

by Ming Liang, et al.

Medical terminology normalization aims to map clinical mentions to terminologies from a knowledge base, and it plays an important role in analyzing Electronic Health Records (EHR) and in many downstream tasks. In this paper, we focus on Chinese procedure terminology normalization. Terminologies are expressed in varied ways, and one medical mention may be linked to multiple terminologies. Previous studies explore methods such as multi-class classification or learning to rank (LTR) to sort the terminologies using literal and semantic information. However, this information is inadequate for finding the right terminologies, particularly in multi-implication cases. In this work, we propose a combined recall and rank framework to solve the above problems. The framework is composed of a multi-task candidate generator (MTCG), a keywords attentive ranker (KAR), and a fusion block (FB). MTCG predicts the number of implications of a mention and recalls candidates by semantic similarity. KAR is based on BERT with a keyword attention mechanism that focuses on keywords such as procedure sites and procedure types. FB merges the similarities from MTCG and KAR to sort the terminologies from different perspectives. Detailed experimental analysis shows that the proposed framework achieves a remarkable improvement in both performance and efficiency.
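The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how FB combines the two similarities, so the weighted sum below, the `alpha` parameter, the function name `fuse_and_select`, and the toy scores are all assumptions made for illustration, not the authors' method. The sketch only shows the general idea: each recalled candidate carries one similarity from the generator and one from the ranker, and the predicted implication number decides how many terminologies are kept.

```python
from typing import Dict, List


def fuse_and_select(recall_scores: Dict[str, float],
                    rank_scores: Dict[str, float],
                    num_implications: int,
                    alpha: float = 0.5) -> List[str]:
    """Combine two similarity scores per candidate and keep the top-k.

    Hypothetical helper: a weighted sum stands in for the fusion block,
    and num_implications stands in for the predicted implication number.
    """
    fused = {}
    for term, recall_sim in recall_scores.items():
        # Candidates the ranker did not score fall back to 0.0 (assumption).
        rank_sim = rank_scores.get(term, 0.0)
        fused[term] = alpha * recall_sim + (1.0 - alpha) * rank_sim
    # Sort by fused score, highest first; break ties by term name.
    ordered = sorted(fused, key=lambda t: (-fused[t], t))
    return ordered[:max(num_implications, 0)]


# Toy scores for three placeholder candidates (not real data).
recall = {"term_a": 0.9, "term_b": 0.6, "term_c": 0.4}
rank = {"term_a": 0.5, "term_b": 0.95, "term_c": 0.1}
top = fuse_and_select(recall, rank, num_implications=2)
print(top)
```

With equal weights, a candidate that the ranker scores highly can overtake one that only the recall stage favoured, which is the sense in which the two perspectives are merged here.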


BERT Busters: Outlier LayerNorm Dimensions that Disrupt BERT

Multiple studies have shown that BERT is remarkably robust to pruning, y...

End-to-end Clinical Event Extraction from Chinese Electronic Health Record

Event extraction is an important work of medical text processing. Accord...

Medical Entity Linking using Triplet Network

Entity linking (or Normalization) is an essential task in text mining th...

Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources

Machine Reading Comprehension (MRC) aims to extract answers to questions...

CODER: Knowledge infused cross-lingual medical term embedding for term normalization

We propose a novel medical term embedding method named CODER, which stan...

MoNoise: Modeling Noise Using a Modular Normalization System

We propose MoNoise: a normalization model focused on generalizability an...

DSR: A Collection for the Evaluation of Graded Disease-Symptom Relations

The effective extraction of ranked disease-symptom relationships is a cr...