Denoising Distantly Supervised Named Entity Recognition via a Hypergeometric Probabilistic Model

06/17/2021
by Wenkai Zhang, et al.

Denoising is an essential step in distantly supervised named entity recognition (NER). Previous denoising methods rely mostly on instance-level confidence statistics, which ignore how the underlying noise distribution varies across datasets and entity types. This makes them difficult to adapt to high-noise-rate settings. In this paper, we propose Hypergeometric Learning (HGL), a denoising algorithm for distantly supervised NER that takes both the noise distribution and instance-level confidence into consideration. Specifically, during neural network training, we model the noisy samples in each batch as following a hypergeometric distribution parameterized by the noise rate. Each instance in the batch is then regarded as either correct or noisy according to its label confidence derived from the previous training step, as well as the noise distribution in the sampled batch. Experiments show that HGL can effectively denoise the weakly labeled data retrieved through distant supervision, and therefore yields significant improvements in the trained models.
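The batch-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `noise_rate`, and the rule of flagging the least-confident instances are all assumptions made for illustration. The sketch draws the number of noisy instances in a batch from a hypergeometric distribution (by sampling without replacement from a population with a known share of noisy items), then marks that many instances as noisy, starting from the lowest label confidence.

```python
import random


def sample_noisy_count(dataset_size, noise_rate, batch_size, rng):
    """Draw how many instances in one batch are noisy (hypothetical helper).

    Sampling batch_size items without replacement from a population that
    holds round(noise_rate * dataset_size) noisy items makes the resulting
    count hypergeometric by construction.
    """
    num_noisy_total = round(noise_rate * dataset_size)
    drawn = rng.sample(range(dataset_size), batch_size)
    # Convention for this sketch: indices below num_noisy_total are noisy.
    return sum(1 for i in drawn if i < num_noisy_total)


def flag_noisy(confidences, noisy_count):
    """Mark the noisy_count least-confident instances as noisy (assumed rule)."""
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    noisy = set(order[:noisy_count])
    return [i in noisy for i in range(len(confidences))]


rng = random.Random(0)
# Assumed per-instance label confidences from a previous training step.
confidences = [0.95, 0.20, 0.80, 0.10, 0.60, 0.99, 0.35, 0.70]
k = sample_noisy_count(
    dataset_size=1000, noise_rate=0.3, batch_size=len(confidences), rng=rng
)
mask = flag_noisy(confidences, k)
```

Instances where `mask` is `True` would be treated as noisy for that step, for example by excluding them from the loss. How HGL actually combines confidence with the sampled count is described in the full paper, not in this abstract.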

research
09/10/2021

Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

We study the problem of training named entity recognition (NER) models u...
research
10/09/2021

Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning

Distantly supervised named entity recognition (DS-NER) efficiently reduc...
research
05/24/2023

CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition

Cross-lingual named entity recognition (NER) aims to train an NER system...
research
06/28/2020

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

We study the open-domain named entity recognition (NER) problem under di...
research
03/18/2020

Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá

The lack of labeled training data has limited the development of natural...
research
04/09/2021

Noisy-Labeled NER with Confidence Estimation

Recent studies in deep learning have shown significant progress in named...
research
10/18/2022

Denoising Enhanced Distantly Supervised Ultrafine Entity Typing

Recently, the task of distantly supervised (DS) ultra-fine entity typing...
