AEDA: An Easier Data Augmentation Technique for Text Classification

08/30/2021
by Akbar Karimi, et al.

This paper proposes AEDA (An Easier Data Augmentation), a technique for improving performance on text classification tasks. AEDA consists solely of randomly inserting punctuation marks into the original text. It is easier to implement than the EDA method (Wei and Zou, 2019), against which we compare our results. In addition, it preserves the order of the words while shifting their positions in the sentence, which leads to better generalization. Furthermore, the deletion operation in EDA can cause a loss of information that, in turn, misleads the network, whereas AEDA preserves all of the input information. Following the baseline, we perform experiments on five different text classification datasets. Models trained on AEDA-augmented data outperform those trained on EDA-augmented data on all five datasets. The source code is available for further study and reproduction of the results.
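
To make the idea concrete, here is a minimal Python sketch of random punctuation insertion. It is an illustration, not the authors' released code: the abstract does not state which punctuation marks are used or how many are inserted, so the `PUNCTUATION` list, the `ratio` parameter, and the function name `insert_punctuation` are all assumptions made for this example.

```python
import random

# Assumed set of marks; the abstract does not list them.
PUNCTUATION = [".", ";", "?", ":", "!", ","]


def insert_punctuation(sentence, ratio=1 / 3, rng=None):
    """Return a copy of `sentence` with random punctuation tokens inserted.

    The number of insertions is drawn uniformly between 1 and
    max(1, int(ratio * number_of_words)); `ratio` is an assumed knob.
    Original words are never deleted or reordered.
    """
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return sentence

    max_inserts = max(1, int(ratio * len(words)))
    n_inserts = rng.randint(1, max_inserts)

    augmented = list(words)
    for _ in range(n_inserts):
        # Any slot is allowed, including before the first and after the last token.
        position = rng.randint(0, len(augmented))
        augmented.insert(position, rng.choice(PUNCTUATION))
    return " ".join(augmented)


if __name__ == "__main__":
    # Output varies from run to run because the insertions are random.
    print(insert_punctuation("the movie was surprisingly good"))
```

Because the sketch only inserts tokens, dropping the inserted marks recovers the original word sequence exactly. That is the property the abstract contrasts with EDA's deletion operation.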


research
01/31/2019

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

We present EDA: easy data augmentation techniques for boosting performan...
research
12/16/2021

ALP: Data Augmentation using Lexicalized PCFGs for Few-Shot Text Classification

Data augmentation has been an important ingredient for boosting performa...
research
09/04/2022

Selective Text Augmentation with Word Roles for Low-Resource Text Classification

Data augmentation techniques are widely used in text classification task...
research
04/05/2023

Performance of Data Augmentation Methods for Brazilian Portuguese Text Classification

Improving machine learning performance while increasing model generaliza...
research
09/01/2021

What Have Been Learned What Should Be Learned? An Empirical Study of How to Selectively Augment Text for Classification

Text augmentation techniques are widely used in text classification prob...
research
05/28/2021

Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax

In Natural Language Processing (NLP), finding data augmentation techniqu...
research
05/02/2020

On the Generalization Effects of Linear Transformations in Data Augmentation

Data augmentation is a powerful technique to improve performance in appl...
