Data Augmentation for Cross-Domain Named Entity Recognition

09/04/2021
by Shuguang Chen, et al.

Current work in named entity recognition (NER) shows that data augmentation techniques can produce more robust models. However, most existing techniques focus on augmenting in-domain data in low-resource scenarios, where annotated data is limited. In contrast, we study cross-domain data augmentation for the NER task: we investigate whether data from high-resource domains can be leveraged by projecting it into low-resource domains. Specifically, we propose a novel neural architecture that transforms the data representation from a high-resource domain to a low-resource domain by learning both the textual patterns that differentiate the two domains (e.g., style, noise, and abbreviations) and a shared feature space in which both domains are aligned. We experiment with diverse datasets and show that transforming the data to the low-resource domain representation achieves significant improvements over using only data from high-resource domains.
