Named Entity Recognition for Partially Annotated Datasets

04/19/2022
by Michael Strobl, et al.

Most Named Entity Recognizers are sequence taggers trained on fully annotated corpora, i.e. corpora in which the class of every word is known for all entity types. Partially annotated corpora, in which some but not all entities of some types are annotated, are too noisy for training sequence taggers: the same entity may be annotated with its true type in one place but left unannotated in another, which misleads the tagger. We therefore compare three training strategies for partially annotated datasets, along with an approach for deriving datasets for new entity classes from Wikipedia without time-consuming manual annotation. To verify that our data acquisition and training approaches are plausible, we manually annotated test datasets for two new classes, food and drugs.
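The noise described in the abstract can be illustrated with a minimal sketch. It does not reproduce any of the paper's training strategies; it only shows how a partially annotated corpus can give the same surface form conflicting labels. The toy sentences, the BIO-style `DRUG` tag, and the helper `conflicting_tokens` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus in BIO format (not taken from the paper).
# "aspirin" is annotated as a drug in the first sentence but left
# unannotated ("O") in the second, as happens with partial annotation.
corpus = [
    [("She", "O"), ("took", "O"), ("aspirin", "B-DRUG"), ("daily", "O")],
    [("Is", "O"), ("aspirin", "O"), ("safe", "O"), ("?", "O")],
]


def conflicting_tokens(sentences):
    """Return tokens seen both with an entity tag and with the 'O' tag."""
    tags_by_token = defaultdict(set)
    for sentence in sentences:
        for token, tag in sentence:
            tags_by_token[token.lower()].add(tag)
    # Keep only tokens that carry "O" plus at least one other tag.
    return {
        token: sorted(tags)
        for token, tags in tags_by_token.items()
        if "O" in tags and len(tags) > 1
    }


conflicts = conflicting_tokens(corpus)
print(conflicts)
```

A tagger trained naively on such data receives contradictory supervision for the flagged tokens, which is the problem the compared training strategies are meant to address.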


research
09/25/2019

Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation

Named entity recognition (NER) identifies typed entity mentions in raw t...
research
09/20/2019

Named Entity Recognition with Partially Annotated Training Data

Supervised machine learning assumes the availability of fully-labeled da...
research
11/25/2022

Finetuning BERT on Partially Annotated NER Corpora

Most Named Entity Recognition (NER) models operate under the assumption ...
research
05/22/2023

Aligning the Norwegian UD Treebank with Entity and Coreference Information

This paper presents a merged collection of entity and coreference annota...
research
05/01/2020

Partially-Typed NER Datasets Integration: Connecting Practice to Theory

While typical named entity recognition (NER) models require the training...
research
12/10/2020

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

In many scenarios, named entity recognition (NER) models severely suffer...
research
01/19/2021

Single versus Multiple Annotation for Named Entity Recognition of Mutations

The focus of this paper is to address the knowledge acquisition bottlene...
