SapAugment: Learning A Sample Adaptive Policy for Data Augmentation

11/02/2020
by   Ting-yao Hu, et al.

Data augmentation methods usually apply the same augmentation (or a mix of augmentations) to all training samples. For example, to perturb data with noise, the noise is sampled from a normal distribution with a fixed standard deviation for all samples. We hypothesize that a hard sample with high training loss already provides a strong training signal to update the model parameters and should be perturbed with mild or no augmentation. Perturbing a hard sample with a strong augmentation may also make it too hard to learn from. Conversely, a sample with low training loss should be perturbed by a stronger augmentation to provide more robustness to a variety of conditions. To formalize these intuitions, we propose a novel method to learn a Sample-Adaptive Policy for Augmentation (SapAugment). Our policy adapts the augmentation parameters based on the training loss of the data samples. In the example of Gaussian noise, a hard sample will be perturbed with low-variance noise and an easy sample with high-variance noise. Furthermore, the proposed method combines multiple augmentation methods into a methodical policy learning framework and obviates hand-crafting augmentation parameters by trial and error. We apply our method to an automatic speech recognition (ASR) task and combine existing and novel augmentations using the proposed framework. We show substantial improvement, up to 21% relative reduction in word error rate on the LibriSpeech dataset, over the state-of-the-art speech augmentation method.
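
The Gaussian noise example in the abstract can be illustrated with a minimal sketch. It is not the learned policy from the paper: the function name `sample_adaptive_noise`, the `max_std` parameter, and the linear mapping from loss rank to noise level are all assumptions chosen for illustration. The sketch only shows the core idea that, within a batch, samples with higher training loss receive weaker perturbation.

```python
import numpy as np

def sample_adaptive_noise(batch, losses, max_std=0.1, rng=None):
    """Add Gaussian noise whose std shrinks as a sample's loss rank grows.

    Hypothetical helper: the linear rank-to-std mapping is an assumption,
    not the policy learned by SapAugment.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = np.asarray(batch, dtype=float)
    losses = np.asarray(losses, dtype=float)
    n = len(losses)
    # rank 0 = lowest loss (easiest sample), rank n-1 = highest loss (hardest)
    ranks = np.argsort(np.argsort(losses))
    # hardness in [0, 1]; a single-sample batch is treated as easiest
    hardness = ranks / (n - 1) if n > 1 else np.zeros(n)
    # easy samples get up to max_std, the hardest sample gets no noise
    stds = max_std * (1.0 - hardness)
    # reshape so each sample's std broadcasts over its feature dimensions
    shape = (n,) + (1,) * (batch.ndim - 1)
    noise = rng.normal(size=batch.shape) * stds.reshape(shape)
    return batch + noise, stds
```

Calling `sample_adaptive_noise(features, per_sample_losses)` on a batch returns the perturbed batch together with the per-sample standard deviations that were applied. Ranking within the batch, rather than using raw loss values, keeps the mapping insensitive to the overall loss scale as training progresses; that choice is also an assumption of this sketch.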

research
10/29/2020

Self-paced Data Augmentation for Training Neural Networks

Data augmentation is widely used for machine learning; however, an effec...
research
02/27/2023

A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit

Data augmentations are known to improve robustness in speech-processing ...
research
10/16/2022

A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR

SpecAugment is a very effective data augmentation method for both HMM an...
research
04/03/2021

On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR

We propose an on-the-fly data augmentation method for automatic speech r...
research
03/09/2023

An Improved Data Augmentation Scheme for Model Predictive Control Policy Approximation

This paper considers the problem of data generation for MPC policy appro...
research
07/17/2018

Learning Noise-Invariant Representations for Robust Speech Recognition

Despite rapid advances in speech recognition, current models remain brit...
research
10/28/2022

Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments

Image recognition models that can work in challenging environments (e.g....
