Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition

08/26/2021
by Yangming Li, et al.

In many situations (e.g., distant supervision), the unlabeled entity problem seriously degrades the performance of named entity recognition (NER) models. Recently, this issue has been addressed effectively by a notable approach based on negative sampling. In this work, we perform two studies along this direction. First, we analyze, both theoretically and empirically, why negative sampling succeeds. Based on the observation that named entities are highly sparse in datasets, we show a theoretical guarantee that, for a long sentence, the probability that the sampled negatives contain no unlabeled entities is high. Missampling tests on synthetic datasets verify this guarantee in practice. Second, to mine hard negatives and further reduce missampling rates, we propose a weighted and adaptive sampling distribution for negative sampling. Experiments on synthetic datasets and well-annotated datasets show that our method significantly improves the robustness and effectiveness of negative sampling. We have also achieved new state-of-the-art results on real-world datasets.
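
The sparsity argument can be illustrated with a small sketch. It assumes that negatives are drawn uniformly, without replacement, from all spans not annotated as entities; the function names, the inclusive `(start, end)` span representation, and the sample size `k` are illustrative assumptions, not the paper's implementation, and the weighted adaptive distribution proposed in the paper is not reproduced here. Under these assumptions, the chance that a sample avoids every unlabeled entity is a hypergeometric probability.

```python
import random
from math import comb


def candidate_spans(n):
    """All inclusive (start, end) spans in a sentence of n tokens."""
    return [(i, j) for i in range(n) for j in range(i, n)]


def sample_negatives(n, labeled, k, rng):
    """Uniformly sample up to k spans that are not annotated as entities.

    Assumption: uniform sampling without replacement over unannotated spans.
    """
    pool = [s for s in candidate_spans(n) if s not in labeled]
    return rng.sample(pool, min(k, len(pool)))


def clean_sample_probability(n, num_labeled, num_unlabeled, k):
    """Probability that k uniform draws contain no unlabeled entity.

    The pool holds every span except the annotated ones, so it still
    includes the num_unlabeled true entities that annotators missed.
    """
    pool = n * (n + 1) // 2 - num_labeled
    k = min(k, pool)
    return comb(pool - num_unlabeled, k) / comb(pool, k)


if __name__ == "__main__":
    rng = random.Random(0)
    labeled = {(0, 1)}  # hypothetical annotated entity span
    print(sample_negatives(10, labeled, 5, rng))
    # With the entity counts and k held fixed, longer sentences make
    # missampling less likely, because the span pool grows quadratically.
    for n in (10, 20, 40):
        print(n, clean_sample_probability(n, 1, 1, 5))
```

Because the number of candidate spans grows quadratically with sentence length while entities stay sparse, the probability printed by the loop increases with `n` when the entity counts and `k` are held fixed.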


