A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR

10/16/2022
by Rui Li, et al.

SpecAugment is a highly effective data augmentation method for both HMM-based and end-to-end (E2E) automatic speech recognition (ASR) systems, and it also works in low-resource scenarios. However, SpecAugment masks the spectrum along the time or frequency axis according to a fixed augmentation policy, which may limit the data diversity it brings to low-resource ASR. In this paper, we propose a policy-based SpecAugment (Policy-SpecAugment) method to alleviate this problem. The idea is to replace the fixed scheme with two policies: an augmentation-select policy and an augmentation-parameter changing policy. These policies are learned from the loss on the validation set, computed with the corresponding augmentation policies applied. The aim is to encourage the model to learn from more diverse data, of the kind it most needs. In experiments, we evaluate the effectiveness of our approach in a low-resource scenario, namely the 100-hour LibriSpeech task. The results and analysis show that the proposed method clearly alleviates the above issue. In addition, the experimental results show that, compared with the state-of-the-art SpecAugment, the proposed Policy-SpecAugment achieves a relative WER reduction of more than 10% [...] Test/Dev-other set, and an absolute WER reduction of more than 1% [...] sets.
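The abstract does not give the exact update rules, so the following is only a minimal sketch of how a loss-driven policy of this kind might look, not the authors' implementation. It assumes that a validation loss has been measured separately for each augmentation, that a softmax over those losses gives the selection probabilities, and that the mask width is scaled by the same probability. The function names, the `temperature` parameter, the maximum widths, and the loss values are all hypothetical.

```python
import numpy as np


def time_mask(spec, width, rng):
    """Zero out `width` consecutive frames (axis 0) at a random offset."""
    out = spec.copy()
    width = min(width, out.shape[0])
    start = rng.integers(0, out.shape[0] - width + 1)
    out[start:start + width, :] = 0.0
    return out


def freq_mask(spec, width, rng):
    """Zero out `width` consecutive frequency bins (axis 1) at a random offset."""
    out = spec.copy()
    width = min(width, out.shape[1])
    start = rng.integers(0, out.shape[1] - width + 1)
    out[:, start:start + width] = 0.0
    return out


AUGMENTERS = {"time_mask": time_mask, "freq_mask": freq_mask}


def selection_probs(val_losses, temperature=1.0):
    """Assumed select policy: softmax over per-augmentation validation losses.

    An augmentation with a higher validation loss gets a higher probability.
    """
    losses = np.asarray(val_losses, dtype=float)
    weights = np.exp((losses - losses.max()) / temperature)
    return weights / weights.sum()


def scaled_width(max_width, prob):
    """Assumed parameter policy: scale the mask width by the selection probability."""
    return max(1, int(round(max_width * prob)))


def policy_augment(spec, val_losses, max_widths, rng):
    """Pick one augmentation by policy and apply it with a policy-scaled width."""
    names = list(AUGMENTERS)
    probs = selection_probs([val_losses[n] for n in names])
    idx = int(rng.choice(len(names), p=probs))
    name = names[idx]
    width = scaled_width(max_widths[name], probs[idx])
    return AUGMENTERS[name](spec, width, rng), name


rng = np.random.default_rng(0)
spec = np.ones((100, 80))  # 100 frames x 80 mel bins, dummy features
val_losses = {"time_mask": 2.0, "freq_mask": 1.0}  # made-up values
max_widths = {"time_mask": 40, "freq_mask": 30}    # made-up values
augmented, chosen = policy_augment(spec, val_losses, max_widths, rng)
```

In a training loop, the validation losses would be re-measured periodically (for example once per epoch) so that the probabilities and mask widths follow the model's current weaknesses rather than staying fixed.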

