Named Entity Recognition with Partially Annotated Training Data

09/20/2019
by Stephen Mayhew, et al.

Supervised machine learning assumes the availability of fully labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated. We study the problem of Named Entity Recognition (NER) with partially annotated training data in which only a fraction of the named entities are labeled, and all other tokens, entities or otherwise, are labeled as non-entity by default. To train on this noisy dataset, we need to distinguish between the true and false negatives. To this end, we introduce a constraint-driven iterative algorithm that learns to detect false negatives in the noisy set and downweight them, resulting in a weighted training set. With this set, we train a weighted NER model. We evaluate our algorithm with weighted variants of neural and non-neural NER models on data in 8 languages from several language and script families, showing a strong ability to learn from partial data. Finally, to show real-world efficacy, we evaluate on a Bengali NER corpus annotated by non-speakers, outperforming the prior state of the art by over 5 F1 points.
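
The loop described in the abstract (train a weighted model, use it to find likely false negatives, lower their weights, repeat) can be illustrated with a toy sketch. This is not the authors' implementation: the one-feature logistic regression, the capitalization feature, the `max_entity_ratio` budget used as the constraint, and all function names are assumptions chosen only to keep the example small and runnable.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_weighted(xs, ys, ws, epochs=500, lr=0.5):
    """Fit a one-feature logistic regression by minimizing weighted log loss."""
    a, b = 0.0, 0.0
    total = sum(ws)
    for _ in range(epochs):
        ga = gb = 0.0
        for x, y, w in zip(xs, ys, ws):
            err = sigmoid(a * x + b) - y   # gradient of log loss w.r.t. the logit
            ga += w * err * x              # each token contributes in proportion to its weight
            gb += w * err
        a -= lr * ga / total
        b -= lr * gb / total
    return a, b

def reweight(xs, ys, model, max_entity_ratio):
    """Downweight the non-entity tokens the model most suspects are entities.

    Hypothetical constraint: at most `max_entity_ratio` of all tokens may be
    treated as entities, so only a limited number of negatives are touched.
    """
    a, b = model
    p_entity = [sigmoid(a * x + b) for x in xs]
    ws = [1.0] * len(xs)                   # labeled entities always keep weight 1
    negatives = sorted(
        (i for i, y in enumerate(ys) if y == 0),
        key=lambda i: -p_entity[i],
    )
    budget = max(int(max_entity_ratio * len(xs)) - sum(ys), 0)
    for i in negatives[:budget]:
        ws[i] = 1.0 - p_entity[i]          # weight = model's belief the token is a true negative
    return ws

# Toy data: x = 1 if the token is capitalized (assumed feature), else 0.
# Tokens 0-3 are really entities, but only tokens 0 and 1 were annotated.
xs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ys = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

weights = [1.0] * len(xs)                  # start by trusting every label equally
history = []
for _ in range(5):
    model = train_weighted(xs, ys, weights)
    weights = reweight(xs, ys, model, max_entity_ratio=0.4)
    history.append(weights[2])

print([round(w, 2) for w in weights])
```

In this toy setting the two unannotated capitalized tokens lose weight with each round, while the annotated entities and the lowercase tokens keep weight 1. A real system would replace the logistic regression with a weighted sequence tagger and choose the constraint from knowledge about the data.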


