SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup

10/05/2020
by Rongzhi Zhang, et al.

Active learning is an important technique for low-resource sequence labeling tasks. However, current active sequence labeling methods use the queried samples alone in each iteration, which is an inefficient way of leveraging human annotations. We propose a simple but effective data augmentation method to improve the label efficiency of active sequence labeling. Our method, SeqMix, simply augments the queried samples by generating extra labeled sequences in each iteration. The key difficulty is to generate plausible sequences along with token-level labels. In SeqMix, we address this challenge by performing mixup for both sequences and token-level labels of the queried samples. Furthermore, we design a discriminator during sequence mixup, which judges whether the generated sequences are plausible or not. Our experiments on Named Entity Recognition and Event Detection tasks show that SeqMix can improve the standard active sequence labeling method by 2.27%–3.75% in terms of F_1 scores. The code and data for SeqMix can be found at https://github.com/rz-zhang/SeqMix
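
The mixup step described above can be illustrated with a minimal sketch. The abstract does not specify the details, so everything here is an assumption made for illustration: the token-by-token pairing of two equal-length sequences, the mapping of each mixed embedding back to its nearest vocabulary token, the Beta-distributed mixing coefficient, and the score-range check standing in for the discriminator. The names `mixup_sequences`, `is_plausible`, `score_fn`, `low`, and `high` are hypothetical and are not taken from the SeqMix repository.

```python
import random


def mixup_sequences(emb_a, emb_b, labels_a, labels_b, vocab_emb, lam):
    """Mix two equal-length sequences token by token (illustrative only).

    emb_a, emb_b: lists of token embeddings (each a list of floats).
    labels_a, labels_b: lists of one-hot label vectors, one per token.
    vocab_emb: dict mapping each vocabulary token to its embedding.
    lam: mixing coefficient in [0, 1].
    """
    tokens, soft_labels = [], []
    for ea, eb, la, lb in zip(emb_a, emb_b, labels_a, labels_b):
        # Convex combination of the two token embeddings.
        mixed = [lam * x + (1 - lam) * y for x, y in zip(ea, eb)]
        # Assumption: map the mixed embedding to the nearest vocabulary
        # token by squared Euclidean distance.
        token = min(
            vocab_emb,
            key=lambda t: sum((m - v) ** 2 for m, v in zip(mixed, vocab_emb[t])),
        )
        tokens.append(token)
        # Mix the token-level labels with the same coefficient.
        soft_labels.append([lam * x + (1 - lam) * y for x, y in zip(la, lb)])
    return tokens, soft_labels


def is_plausible(tokens, score_fn, low, high):
    """Hypothetical discriminator: accept a generated sequence only if an
    externally supplied plausibility score falls inside [low, high]."""
    return low <= score_fn(tokens) <= high


# Toy vocabulary with 2-dimensional embeddings (made-up values).
vocab = {"Paris": [1.0, 0.0], "London": [0.0, 1.0], "visited": [0.5, 0.5]}

# Assumption: the mixing coefficient is drawn from Beta(alpha, alpha),
# as in standard mixup.
lam_sampled = random.betavariate(8.0, 8.0)

# A fixed coefficient keeps the example deterministic.
mixed_tokens, mixed_labels = mixup_sequences(
    emb_a=[vocab["Paris"]],
    emb_b=[vocab["London"]],
    labels_a=[[1.0, 0.0]],
    labels_b=[[0.0, 1.0]],
    vocab_emb=vocab,
    lam=0.8,
)
```

With `lam=0.8` the mixed embedding lies closest to the first sequence's token, and the label vector becomes a soft mixture weighted the same way. In practice `score_fn` would be some model-based plausibility measure; the abstract says only that the discriminator judges whether generated sequences are plausible.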

research
10/22/2020

An Analysis of Simple Data Augmentation for Named Entity Recognition

Simple yet effective data augmentation techniques have been proposed for...
research
10/09/2020

iobes: A Library for Span-Level Processing

Many tasks in natural language processing, such as named entity recognit...
research
05/19/2023

Enhancing Few-shot NER with Prompt Ordering based Data Augmentation

Recently, data augmentation (DA) methods have been proven to be effectiv...
research
01/08/2020

LTP: A New Active Learning Strategy for Bert-CRF Based Named Entity Recognition

In recent years, deep learning has achieved great success in many natura...
research
05/14/2020

NAT: Noise-Aware Training for Robust Neural Sequence Labeling

Sequence labeling systems should perform reliably not only under ideal c...
research
09/15/2020

Augmented Natural Language for Generative Sequence Labeling

We propose a generative framework for joint sequence labeling and senten...
research
06/06/2022

Global Mixup: Eliminating Ambiguity with Clustering

Data augmentation with Mixup has been proven an effective method to regu...
