Focusing on Possible Named Entities in Active Named Entity Label Acquisition

by Ali Osman Berk Sapci, et al.

Named entity recognition (NER) aims to identify mentions of named entities in unstructured text and classify them into predefined named entity classes. Even though deep learning-based pre-trained language models achieve good predictive performance, many domain-specific NER tasks still require a sufficient amount of labeled data. Active learning (AL), a general framework for the label acquisition problem, has been used for NER tasks to minimize annotation cost without sacrificing model performance. However, the heavily imbalanced class distribution of tokens makes it challenging to design effective AL querying methods for NER. We propose AL sentence query evaluation functions that pay more attention to possible positive tokens, and we evaluate these functions with both sentence-based and token-based cost evaluation strategies. We also propose a better data-driven normalization approach to penalize sentences that are too long or too short. Our experiments on three datasets from different domains reveal that the proposed approaches reduce the number of annotated tokens while achieving prediction performance better than or comparable to that of conventional methods.
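The idea of scoring a sentence mainly by its possibly positive tokens, and then penalizing unusual sentence lengths, can be illustrated with a minimal sketch. This is not the authors' formulation: the entropy-based uncertainty, the `threshold` on non-`O` probability mass, the Gaussian-shaped length penalty, and all function names below are assumptions chosen only for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sentence_score(token_probs, outside_index=0, threshold=0.1):
    """Sum token entropies, counting only tokens that might be entities.

    Assumption: a token is a "possible named entity" when the probability
    mass outside the 'O' class (at `outside_index`) exceeds `threshold`.
    """
    score = 0.0
    for probs in token_probs:
        if 1.0 - probs[outside_index] > threshold:
            score += token_entropy(probs)
    return score


def length_penalty(length, mean_length, std_length):
    """Assumed Gaussian-shaped weight in (0, 1], peaking at the mean length."""
    z = (length - mean_length) / std_length
    return math.exp(-0.5 * z * z)


def select_queries(pool, k, mean_length, std_length):
    """Return indices of the k highest-scoring sentences in the unlabeled pool."""
    scored = [
        (sentence_score(tp) * length_penalty(len(tp), mean_length, std_length), i)
        for i, tp in enumerate(pool)
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


# Hypothetical pool: each sentence is a list of per-token class distributions,
# with class 0 standing for 'O'.
pool = [
    [[0.98, 0.01, 0.01], [0.99, 0.005, 0.005]],  # confidently non-entity
    [[0.50, 0.30, 0.20], [0.90, 0.05, 0.05]],    # one uncertain, possibly positive token
]
chosen = select_queries(pool, k=1, mean_length=2.0, std_length=1.0)
```

In this toy pool, the first sentence contributes nothing to the score because none of its tokens pass the threshold, so the second sentence is queried. In practice the mean and standard deviation of sentence length would be estimated from the data, which is one possible reading of "data-driven normalization".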






