MedMentions: A Large Biomedical Corpus Annotated with UMLS Concepts

02/25/2019
by Sunil Mohan, et al.

This paper presents the formal release of MedMentions, a new manually annotated resource for the recognition of biomedical concepts. What distinguishes MedMentions from other annotated biomedical corpora is its size (over 4,000 abstracts and over 350,000 linked mentions), as well as the size of the concept ontology (over 3 million concepts from UMLS 2017) and its broad coverage of biomedical disciplines. In addition to the full corpus, a sub-corpus of MedMentions is also presented, comprising annotations for a subset of UMLS 2017 targeted towards document retrieval. To encourage research in Biomedical Named Entity Recognition and Linking, data splits for training and testing are included in the release, and a baseline model and its metrics for entity linking are also described.
