Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation

12/05/2020
by Ruibo Liu, et al.

Data augmentation has proven effective in many NLU tasks, especially those suffering from data scarcity. In this paper, we present a powerful and easy-to-deploy text augmentation framework, Data Boost, which augments data through reinforcement learning guided conditional generation. We evaluate Data Boost on three diverse text classification tasks under five different classifier architectures. The results show that Data Boost can boost the performance of classifiers, especially in low-resource data scenarios. For instance, Data Boost improves F1 for the three tasks by 8.7% on average when given only 10% of the whole data for training. We also compare Data Boost with six prior text augmentation methods. Through human evaluations (N=178), we confirm that Data Boost augmentation has quality comparable to the original data with respect to readability and class consistency.
