Comprehensive Named Entity Recognition on CORD-19 with Distant or Weak Supervision

03/27/2020
by Xuan Wang, et al.

We created the CORD-19-NER dataset by running comprehensive named entity recognition (NER) on the COVID-19 Open Research Dataset Challenge (CORD-19) corpus (2020-03-13). The dataset covers 74 fine-grained named entity types. It is generated automatically by combining the annotation results from four sources: (1) a pre-trained NER model on 18 general entity types from spaCy, (2) a pre-trained NER model on 18 biomedical entity types from scispaCy, (3) a knowledge base (KB)-guided NER model on 127 biomedical entity types, trained with our distantly-supervised NER method, and (4) a seed-guided NER model on 8 new entity types specifically related to COVID-19 studies, trained with our weakly-supervised NER method. We hope this dataset helps the text mining community build downstream applications, and that it brings insights for COVID-19 studies on both the biomedical side and the social side.
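The abstract does not say how the four annotation sources are reconciled when they disagree. As a rough illustration only, the sketch below merges entity spans from several annotators by giving each source a priority and dropping any span that overlaps one already kept. The `Span` type, the `merge_annotations` function, the source names, the priority order, and the example labels are all assumptions made for this sketch, not the authors' method.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity
    label: str   # entity type (illustrative labels only)
    source: str  # which annotator produced the span (hypothetical names)


def merge_annotations(spans: List[Span], priority: List[str]) -> List[Span]:
    """Keep non-overlapping spans, preferring higher-priority sources.

    Assumed rule: sources earlier in `priority` win; ties within a
    source are broken in favour of the longer span.
    """
    rank = {name: i for i, name in enumerate(priority)}
    ordered = sorted(
        spans,
        key=lambda s: (rank[s.source], -(s.end - s.start), s.start),
    )
    kept: List[Span] = []
    for span in ordered:
        # Accept the span only if it overlaps nothing accepted so far.
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)


text = "Remdesivir was tested against SARS-CoV-2 in Wuhan."
candidates = [
    Span(0, 10, "CHEMICAL", "kb"),
    Span(30, 40, "CORONAVIRUS", "seed"),
    Span(30, 34, "ORG", "general"),  # partial match, lower priority
    Span(44, 49, "GPE", "general"),
]
merged = merge_annotations(
    candidates, priority=["seed", "kb", "biomedical", "general"]
)
for span in merged:
    print(text[span.start:span.end], span.label, span.source)
```

Under these assumptions, the partial `ORG` match on "SARS" is discarded because the seed-guided span over "SARS-CoV-2" has higher priority. A different conflict rule (for example, always keeping the longest span regardless of source) would need only a different sort key.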


research
06/04/2019

NNE: A Dataset for Nested Named Entity Recognition in English Newswire

Named entity recognition (NER) is widely used in natural language proces...
research
06/01/2019

Biomedical Named Entity Recognition via Reference-Set Augmented Bootstrapping

We present a weakly-supervised data augmentation approach to improve Nam...
research
01/06/2022

BERN2: an advanced neural biomedical named entity recognition and normalization tool

In biomedical natural language processing, named entity recognition (NER...
research
05/22/2023

Partial Annotation Learning for Biomedical Entity Recognition

Motivation: Named Entity Recognition (NER) is a key task to support biom...
research
06/27/2023

DMNER: Biomedical Entity Recognition by Detection and Matching

Biomedical named entity recognition (BNER) serves as the foundation for ...
research
11/10/2019

Knowledge Guided Named Entity Recognition

In this work, we try to perform Named Entity Recognition (NER) with exte...
research
05/18/2020

A Semantically Enriched Dataset based on Biomedical NER for the COVID19 Open Research Dataset Challenge

Research into COVID-19 is a big challenge and highly relevant at the mom...
