Biomedical Named Entity Recognition via Reference-Set Augmented Bootstrapping

06/01/2019
by Joel Mathew, et al.

We present a weakly-supervised data augmentation approach to improve Named Entity Recognition (NER) in a challenging domain: extracting biomedical entities (e.g., proteins) from the scientific literature. First, we train a neural NER (NNER) model over a small seed of fully-labeled examples. Second, we use a reference set of entity names (e.g., proteins in UniProt) to identify entity mentions with high precision, but low recall, on an unlabeled corpus. Third, we use the NNER model to assign weak labels to the corpus. Finally, we retrain our NNER model iteratively over the augmented training set, including the seed, the reference-set examples, and the weakly-labeled examples, which improves model performance. We show empirically that this augmented bootstrapping process significantly improves NER performance, and discuss the factors impacting the efficacy of the approach.
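The four steps in the abstract can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyTagger` is a hypothetical stand-in for the neural NER model (it only memorizes entity tokens and the word preceding them), the single `ENT`/`O` tag scheme is a simplification, and merging reference-set labels with model labels by taking their union is an assumption, since the abstract does not say how the two label sources are combined.

```python
from typing import List, Set, Tuple

Example = Tuple[List[str], List[str]]  # (tokens, labels), labels are "ENT" or "O"


def reference_set_labels(tokens: List[str], reference_set: Set[str]) -> List[str]:
    """Step 2: exact matching against a reference set (high precision, low recall)."""
    return ["ENT" if tok in reference_set else "O" for tok in tokens]


class ToyTagger:
    """Hypothetical stand-in for the neural NER model (assumption, not the paper's model).

    It memorizes entity tokens and the non-entity word immediately before each entity.
    """

    def __init__(self) -> None:
        self.entity_tokens: Set[str] = set()
        self.cues: Set[str] = set()

    def fit(self, examples: List[Example]) -> None:
        # Retrain from scratch on the given examples.
        self.entity_tokens, self.cues = set(), set()
        for tokens, labels in examples:
            for i, (tok, lab) in enumerate(zip(tokens, labels)):
                if lab == "ENT":
                    self.entity_tokens.add(tok)
                    if i > 0 and labels[i - 1] == "O":
                        self.cues.add(tokens[i - 1])

    def predict(self, tokens: List[str]) -> List[str]:
        # A token is tagged ENT if it is a known entity or follows a known cue word.
        return [
            "ENT"
            if tok in self.entity_tokens or (i > 0 and tokens[i - 1] in self.cues)
            else "O"
            for i, tok in enumerate(tokens)
        ]


def bootstrap(
    seed: List[Example],
    unlabeled: List[List[str]],
    reference_set: Set[str],
    rounds: int = 2,
) -> ToyTagger:
    tagger = ToyTagger()
    tagger.fit(seed)  # Step 1: train on the small fully-labeled seed.
    for _ in range(rounds):
        augmented = list(seed)
        for tokens in unlabeled:
            ref = reference_set_labels(tokens, reference_set)  # Step 2
            weak = tagger.predict(tokens)  # Step 3: weak labels from the model
            # Assumption: union of the two label sources.
            merged = ["ENT" if "ENT" in (r, w) else "O" for r, w in zip(ref, weak)]
            augmented.append((tokens, merged))
        tagger.fit(augmented)  # Step 4: retrain on seed + augmented examples.
    return tagger


# Made-up toy data for illustration only.
SEED: List[Example] = [
    (["the", "protein", "BRCA1", "binds", "DNA"], ["O", "O", "ENT", "O", "O"]),
]
REFERENCE_SET: Set[str] = {"BRCA1", "TP53"}
UNLABELED: List[List[str]] = [
    ["the", "protein", "MDM2", "inhibits", "TP53"],
    ["kinase", "AKT1", "phosphorylates", "MDM2"],
]
```

In this toy run, the seed teaches the tagger that a token after "protein" is likely an entity, so "MDM2" receives a weak label in the first unlabeled sentence, while "TP53" is picked up by the reference set. After retraining, "MDM2" is also recognized in the second sentence, where neither the seed nor the reference set alone would have labeled it.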

