BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition

05/26/2021
by Yinghao Li, et al.

We study the problem of learning a named entity recognition (NER) tagger using noisy labels from multiple weak supervision sources. Though cheap to obtain, the labels from weak supervision sources are often incomplete, inaccurate, and contradictory, making it difficult to learn an accurate NER model. To address this challenge, we propose a conditional hidden Markov model (CHMM), which can effectively infer true labels from multi-source noisy labels in an unsupervised way. CHMM enhances the classic hidden Markov model with the contextual representation power of pre-trained language models. Specifically, CHMM learns token-wise transition and emission probabilities from the BERT embeddings of the input tokens to infer the latent true labels from noisy observations. We further refine CHMM with an alternate-training approach (CHMM-ALT). It fine-tunes a BERT-NER model with the labels inferred by CHMM, and this BERT-NER's output is regarded as an additional weak source to train the CHMM in return. Experiments on four NER benchmarks from various domains show that our method outperforms state-of-the-art weakly supervised NER models by wide margins.
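
To make the inference step concrete, the following is a minimal sketch of Viterbi decoding in a hidden Markov model whose transition and emission matrices differ at every token, with several weak sources observed per token. It is an illustration under simplifying assumptions, not the authors' implementation: each source is assumed to emit one discrete label per token, the sources are assumed conditionally independent given the true label, and the per-token matrices (which CHMM would predict from BERT embeddings) are simply passed in as plain lists. The function name `viterbi_tokenwise` and all of its parameters are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def _log(p: float) -> float:
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_tokenwise(
    init: Sequence[float],
    transitions: Sequence[Matrix],
    emissions: Sequence[Sequence[Matrix]],
    observations: Sequence[Sequence[int]],
) -> List[int]:
    """Return the most likely hidden label sequence.

    init[i]                : P(z_0 = i)
    transitions[t][i][j]   : P(z_{t+1} = j | z_t = i), one matrix per step
    emissions[t][k][i][o]  : P(source k emits label o at token t | z_t = i)
    observations[t][k]     : label emitted by weak source k at token t
    """
    if not observations:
        return []
    n_labels = len(init)

    def emit_logp(t: int, i: int) -> float:
        # Assumption: sources are independent given the hidden label,
        # so their log-probabilities add up.
        return sum(
            _log(emissions[t][k][i][o]) for k, o in enumerate(observations[t])
        )

    score = [_log(init[i]) + emit_logp(0, i) for i in range(n_labels)]
    backptrs: List[List[int]] = []
    for t in range(1, len(observations)):
        trans = transitions[t - 1]  # token-specific transition matrix
        new_score, ptr = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels), key=lambda i: score[i] + _log(trans[i][j]))
            new_score.append(score[best_i] + _log(trans[best_i][j]) + emit_logp(t, j))
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Backtrack from the best final label.
    path = [max(range(n_labels), key=lambda i: score[i])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy usage: 2 labels (0 = O, 1 = ENTITY), 2 weak sources, 3 tokens.
reliable = [[0.9, 0.1], [0.2, 0.8]]   # source 0: mostly agrees with the truth
noisy = [[0.6, 0.4], [0.4, 0.6]]      # source 1: only weakly informative
uniform = [[0.5, 0.5], [0.5, 0.5]]
toy_emissions = [[reliable, noisy]] * 3
toy_observations = [[0, 0], [1, 0], [1, 1]]
print(viterbi_tokenwise([0.5, 0.5], [uniform, uniform], toy_emissions, toy_observations))
```

At the second token the two sources disagree, and the decoder follows the more reliable one because its emission matrix is sharper. In the method described above, unsupervised training would fit these matrices rather than take them as given.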
