Validating Label Consistency in NER Data Annotation

01/21/2021
by   Qingkai Zeng, et al.

Data annotation plays a crucial role in ensuring that named entity recognition (NER) models are trained on the right information. Producing accurate labels is challenging because of the complexity of annotation. Label inconsistency between multiple subsets of annotated data (e.g., training set and test set, or multiple training subsets) is an indicator of label mistakes. In this work, we present an empirical method to explore the relationship between label (in-)consistency and NER model performance. It can be used to validate the label consistency (or catch the inconsistency) in multiple sets of NER data annotation. In experiments, our method identified the label inconsistency of test data in the SCIERC and CoNLL03 datasets (with 26.7 [...] the corrected version of both datasets.
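The abstract does not spell out the procedure itself, so the following is not the authors' method. As a rough illustration of what label inconsistency between two annotated subsets can look like, here is a minimal Python sketch that assumes BIO-tagged data and simply reports entity mentions whose annotated types differ between the subsets. The function names (`mention_labels`, `inconsistent_mentions`) and the toy sentences are hypothetical.

```python
from collections import defaultdict


def mention_labels(sentences):
    """Map each lowercased entity mention to the set of types it was annotated with.

    `sentences` is a list of sentences; each sentence is a list of
    (token, BIO-tag) pairs, e.g. [("New", "B-LOC"), ("York", "I-LOC")].
    """
    labels = defaultdict(set)
    for sentence in sentences:
        tokens, etype = [], None
        # A sentinel "O" tag flushes an entity that ends at the sentence boundary.
        for token, tag in list(sentence) + [("", "O")]:
            if tag.startswith("I-") and etype == tag[2:]:
                tokens.append(token)  # continue the current entity
                continue
            if tokens:  # the current entity has ended: record it
                labels[" ".join(tokens).lower()].add(etype)
                tokens, etype = [], None
            if tag.startswith("B-") or tag.startswith("I-"):
                # Start a new entity (a stray "I-" tag is treated like "B-").
                tokens, etype = [token], tag[2:]
    return labels


def inconsistent_mentions(subset_a, subset_b):
    """Return mentions found in both subsets whose sets of types differ."""
    a, b = mention_labels(subset_a), mention_labels(subset_b)
    return {m: (a[m], b[m]) for m in a.keys() & b.keys() if a[m] != b[m]}


# Toy data (made up for illustration, not taken from SCIERC or CoNLL03).
train = [[("Japan", "B-LOC"), ("won", "O")]]
test = [[("Japan", "B-ORG"), ("beat", "O"), ("New", "B-LOC"), ("York", "I-LOC")]]
print(inconsistent_mentions(train, test))
```

A surface-form comparison like this only flags candidates: the same string can legitimately carry different types in different contexts, so flagged mentions would still need manual review.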

research
09/03/2019

CrossWeigh: Training Named Entity Tagger from Imperfect Annotations

Everyone makes mistakes. So do human annotators when curating labels for...
research
04/17/2023

K-means Clustering Based Feature Consistency Alignment for Label-free Model Evaluation

The label-free model evaluation aims to predict the model performance on...
research
10/24/2022

Enhancing Label Consistency on Document-level Named Entity Recognition

Named entity recognition (NER) is a fundamental part of extracting infor...
research
07/14/2011

Label-Specific Training Set Construction from Web Resource for Image Annotation

Recently many research efforts have been devoted to image annotation by ...
research
08/07/2022

SciAnnotate: A Tool for Integrating Weak Labeling Sources for Sequence Labeling

Weak labeling is a popular weak supervision strategy for Named Entity Re...
research
11/25/2022

Finetuning BERT on Partially Annotated NER Corpora

Most Named Entity Recognition (NER) models operate under the assumption ...
research
03/27/2019

ner and pos when nothing is capitalized

For those languages which use it, capitalization is an important signal ...
