EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

01/31/2019
by Jason W. Wei, et al.

We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classification tasks, we show that EDA improves performance for both convolutional and recurrent neural networks. EDA demonstrates particularly strong results for smaller datasets; on average, across five datasets, training with EDA while using only 50% of the available training set achieved the same accuracy as normal training with all available data. We also performed extensive ablation studies and suggest parameters for practical use.
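As a rough illustration of the four operations named in the abstract, the sketch below applies each one at the word level. The WordNet-based synonym lookup (via NLTK) and all helper names here are assumptions made for illustration; they are not claimed to be the authors' released implementation.

```python
import random
from nltk.corpus import wordnet  # requires: nltk.download('wordnet')


def get_synonyms(word):
    """Collect WordNet synonyms for a word, excluding the word itself."""
    synonyms = set()
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            candidate = lemma.name().replace("_", " ").lower()
            if candidate != word:
                synonyms.add(candidate)
    return list(synonyms)


def synonym_replacement(words, n):
    """Replace up to n randomly chosen words with one of their synonyms."""
    new_words = words.copy()
    candidates = [w for w in set(words) if get_synonyms(w)]
    random.shuffle(candidates)
    for word in candidates[:n]:
        synonym = random.choice(get_synonyms(word))
        new_words = [synonym if w == word else w for w in new_words]
    return new_words


def random_insertion(words, n):
    """Insert a synonym of a random word at a random position, n times."""
    new_words = words.copy()
    for _ in range(n):
        for _ in range(10):  # bounded retries in case the picked word has no synonyms
            synonyms = get_synonyms(random.choice(new_words))
            if synonyms:
                new_words.insert(random.randint(0, len(new_words)),
                                 random.choice(synonyms))
                break
    return new_words


def random_swap(words, n):
    """Swap the positions of two randomly chosen words, n times."""
    new_words = words.copy()
    if len(new_words) < 2:
        return new_words
    for _ in range(n):
        i, j = random.sample(range(len(new_words)), 2)
        new_words[i], new_words[j] = new_words[j], new_words[i]
    return new_words


def random_deletion(words, p):
    """Delete each word independently with probability p, keeping at least one word."""
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]
```

In a typical augmentation pass, a sentence is tokenized into words, one or more of these operations is applied, and the result is added to the training set alongside the original example; the paper's ablation studies suggest concrete parameter settings for practical use.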


Related research

03/12/2021
Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
Few-shot text classification is a fundamental NLP task in which a model ...

08/30/2021
AEDA: An Easier Data Augmentation Technique for Text Classification
This paper proposes AEDA (An Easier Data Augmentation) technique to help...

01/28/2022
You Only Cut Once: Boosting Data Augmentation with a Single Cut
We present You Only Cut Once (YOCO) for performing data augmentations. Y...

06/01/2020
Concept Matching for Low-Resource Classification
We propose a model to tackle classification tasks in the presence of ver...

12/16/2021
ALP: Data Augmentation using Lexicalized PCFGs for Few-Shot Text Classification
Data augmentation has been an important ingredient for boosting performa...

04/16/2021
Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase
We introduce a data augmentation technique based on byte pair encoding a...

05/02/2020
On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique to improve performance in appl...