How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?

10/05/2020
by Shayne Longpre, et al.

Task-agnostic forms of data augmentation have proven widely effective in computer vision, even on pretrained models. In NLP, similar results are reported most commonly for low-data regimes, for non-pretrained models, or only situationally for pretrained models. In this paper we ask how effective these techniques really are when applied to pretrained transformers. Using two popular varieties of task-agnostic data augmentation (not tailored to any particular task), Easy Data Augmentation (Wei and Zou, 2019) and Back-Translation (Sennrich et al., 2015), we conduct a systematic examination of their effects across 5 classification tasks, 6 datasets, and 3 variants of modern pretrained transformers, including BERT, XLNet, and RoBERTa. We observe a negative result: techniques that previously yielded strong improvements for non-pretrained models fail to consistently improve performance for pretrained transformers, even when training data is limited. We hope this empirical analysis helps inform practitioners about where data augmentation techniques may confer improvements.
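For readers unfamiliar with the two augmentation families named above, the sketch below illustrates them. It is not the authors' implementation: the first functions show two of EDA's four word-level operations (random swap and random deletion; the full method of Wei and Zou (2019) also includes synonym replacement and random insertion), and the last shows back-translation as a round trip through a pivot language using off-the-shelf Helsinki-NLP MarianMT models from Hugging Face. Function names, parameter defaults, and the choice of German as the pivot language are illustrative assumptions, not details from the paper.

import random

def random_swap(words, n_swaps=1):
    # EDA operation: swap the positions of two randomly chosen words, n_swaps times.
    words = list(words)
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    # EDA operation: drop each word independently with probability p,
    # always keeping at least one word.
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]

def eda_augment(sentence, alpha=0.1, num_aug=4):
    # Produce num_aug augmented copies of a sentence; alpha is the fraction
    # of words affected per operation, following the EDA convention.
    words = sentence.split()
    n_swaps = max(1, int(alpha * len(words)))
    augmented = []
    for _ in range(num_aug):
        if random.random() < 0.5:
            new_words = random_swap(words, n_swaps)
        else:
            new_words = random_deletion(words, p=alpha)
        augmented.append(" ".join(new_words))
    return augmented

def back_translate(sentence):
    # Back-translation: translate to a pivot language and back with
    # off-the-shelf MarianMT models (requires the `transformers` package).
    from transformers import pipeline
    en_de = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
    de_en = pipeline("translation_de_to_en", model="Helsinki-NLP/opus-mt-de-en")
    pivot = en_de(sentence)[0]["translation_text"]
    return de_en(pivot)[0]["translation_text"]

if __name__ == "__main__":
    print(eda_augment("task-agnostic augmentation rarely helps pretrained transformers"))

In practice, each original training example is expanded into several such paraphrases with the original label attached; the paper's finding is that this extra data rarely helps once the encoder is a pretrained transformer.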
