Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition

10/14/2022
by Shuguang Chen, et al.

In this work, we take English named entity recognition (NER) as a case study and explore style transfer as a data augmentation method for increasing the size and diversity of training data in low-resource scenarios. We propose a new method that transforms text from a high-resource domain into a low-resource domain by changing its style-related attributes, which yields synthetic data for training. We also design a constrained decoding algorithm, together with a set of key ingredients for data selection, to guarantee that the generated data is valid and coherent. Experiments and analysis on five different domain pairs under different data regimes show that our approach significantly improves results over current state-of-the-art data augmentation methods. Our approach is a practical solution to data scarcity, and we expect it to be applicable to other NLP tasks.
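To make the idea of selecting valid synthetic data concrete, here is a minimal sketch of one possible filter. It is not the paper's selection procedure, which the abstract does not spell out. It assumes BIO-tagged tokens, and the function names (`is_valid_bio`, `extract_entities`, `keep_synthetic`) and the rule itself are hypothetical: a style-transferred sentence is kept only if its tag sequence is well formed and it preserves the entity mentions of the source sentence.

```python
from typing import List, Tuple

Example = Tuple[List[str], List[str]]  # (tokens, BIO tags); assumed format


def is_valid_bio(tags: List[str]) -> bool:
    """Return True if every I-X tag continues a B-X or I-X span of the same type."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            if prev not in (f"B-{tag[2:]}", f"I-{tag[2:]}"):
                return False
        elif tag != "O" and not tag.startswith("B-"):
            return False  # unknown tag shape
        prev = tag
    return True


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (lowercased mention, entity type) pairs from a BIO-tagged sentence."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    etype = ""
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok.lower()], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok.lower())
        else:
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], ""
    if current:
        entities.append((" ".join(current), etype))
    return entities


def keep_synthetic(source: Example, synthetic: Example) -> bool:
    """Hypothetical selection rule: well-formed tags and the same entities as the source."""
    syn_tokens, syn_tags = synthetic
    if len(syn_tokens) != len(syn_tags) or not is_valid_bio(syn_tags):
        return False
    return sorted(extract_entities(*source)) == sorted(extract_entities(*synthetic))


source = (["Obama", "visited", "Paris"], ["B-PER", "O", "B-LOC"])
candidate = (["omg", "obama", "was", "in", "Paris"], ["O", "B-PER", "O", "O", "B-LOC"])
print(keep_synthetic(source, candidate))
```

A filter of this kind only checks surface consistency between the source and the generated sentence. Judging whether the output is fluent or actually carries the target-domain style would need additional signals.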
