Extracting UMLS Concepts from Medical Text Using General and Domain-Specific Deep Learning Models

10/03/2019
by Kathleen C. Fraser, et al.

Entity recognition is a critical first step in a number of clinical NLP applications, such as entity linking and relation extraction. We present the first attempt to apply state-of-the-art entity recognition approaches to a newly released dataset, MedMentions. This dataset contains over 4000 biomedical abstracts, annotated for UMLS semantic types. In comparison to existing datasets, MedMentions contains a far greater number of entity types, and thus represents a more challenging but more realistic scenario. We explore a number of relevant dimensions, including the use of contextual versus non-contextual word embeddings, general versus domain-specific unsupervised pre-training, and different deep learning architectures. We contrast our results against the well-known i2b2 2010 entity recognition dataset, and propose a new method to combine general and domain-specific information. While we achieve a state-of-the-art result on the i2b2 2010 task (F1 = 0.90), our results on MedMentions are significantly lower (F1 = 0.63), suggesting there is still plenty of opportunity for improvement on this new data.

