CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition

05/24/2023
by Tingting Ma, et al.

Cross-lingual named entity recognition (NER) aims to train an NER system that generalizes well to a target language by leveraging labeled data in a given source language. Previous work alleviates the data scarcity problem by translating source-language labeled data or performing knowledge distillation on target-language unlabeled data. However, these methods may suffer from label noise due to the automatic labeling process. In this paper, we propose CoLaDa, a Collaborative Label Denoising Framework, to address this problem. Specifically, we first explore a model-collaboration-based denoising scheme that enables models trained on different data sources to collaboratively denoise pseudo labels used by each other. We then present an instance-collaboration-based strategy that considers the label consistency of each token's neighborhood in the representation space for denoising. Experiments on different benchmark datasets show that the proposed CoLaDa achieves superior results compared to previous methods, especially when generalizing to distant languages.
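The instance-collaboration idea, checking whether a token's pseudo label agrees with the labels of its neighbors in representation space, can be illustrated with a small sketch. This is not the authors' implementation: the function name `neighborhood_label_consistency`, the use of cosine similarity, the choice of `k`, and the toy embeddings are all assumptions made for illustration. It only shows one plausible way to turn neighborhood agreement into a per-token reliability score.

```python
import numpy as np

def neighborhood_label_consistency(embeddings, labels, k=2):
    """Hypothetical helper: for each token, return the fraction of its
    k nearest neighbors (by cosine similarity) that share its pseudo label."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # A token must not count as its own neighbor.
    np.fill_diagonal(sim, -np.inf)

    # Indices of the k most similar tokens for each row.
    neighbors = np.argsort(-sim, axis=1)[:, :k]

    # Compare each neighbor's label with the token's own label.
    agree = labels[neighbors] == labels[:, None]
    return agree.mean(axis=1)

# Toy data (assumed): two clusters in a 2-D representation space.
# Token 2 sits in the "PER" cluster but carries the pseudo label "O".
toy_embeddings = [
    [1.00, 0.00], [0.90, 0.10], [0.95, 0.05],
    [0.00, 1.00], [0.10, 0.90], [0.05, 0.95],
]
toy_labels = ["PER", "PER", "O", "LOC", "LOC", "LOC"]

scores = neighborhood_label_consistency(toy_embeddings, toy_labels, k=2)
for label, score in zip(toy_labels, scores):
    print(f"{label}: consistency={score:.2f}")
```

A low score flags a token whose pseudo label disagrees with its neighborhood, so it could be down-weighted or relabeled during training. How such a score is actually used in CoLaDa is described in the full paper, not in this abstract.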

research
11/17/2022

ConNER: Consistency Training for Cross-lingual Named Entity Recognition

Cross-lingual named entity recognition (NER) suffers from data scarcity ...
research
04/26/2020

Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

To better tackle the named entity recognition (NER) problem on languages...
research
07/15/2020

UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

Prior works in cross-lingual named entity recognition (NER) with no/litt...
research
06/04/2021

AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER

Neural methods have been shown to achieve high performance in Named Enti...
research
06/17/2021

Denoising Distantly Supervised Named Entity Recognition via a Hypergeometric Probabilistic Model

Denoising is the essential step for distant supervision based named enti...
research
08/31/2021

MELM: Data Augmentation with Masked Entity Language Modeling for Cross-lingual NER

Data augmentation for cross-lingual NER requires fine-grained control ov...
research
08/17/2023

mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning

Cross-lingual named entity recognition (CrossNER) faces challenges stemm...
