Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition

04/08/2022
by Zehai Tu, et al.

End-to-end models have achieved significant improvements in automatic speech recognition. One common way to improve the performance of these models is to expand the data space through data augmentation. Meanwhile, front-ends inspired by the human auditory system have also been shown to improve automatic speech recognisers. In this work, a well-verified auditory-based model, which can simulate various hearing abilities, is investigated as a means of data augmentation for end-to-end speech recognition. Introducing the auditory model into the data augmentation process encourages end-to-end systems to ignore variation in the signal that cannot be heard and thereby to focus on robust features for speech recognition. Two mechanisms in the auditory model, spectral smearing and loudness recruitment, are studied on the LibriSpeech dataset with a transformer-based end-to-end model. The results show that the proposed augmentation methods bring statistically significant improvements over the performance of the state-of-the-art SpecAugment.
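
To give a rough feel for what spectral smearing does to a training example, here is a minimal sketch in Python with NumPy. It is not the auditory model used in the paper: it assumes that smearing can be approximated by blurring each frame of a power spectrogram along the frequency axis with a Gaussian kernel, and the function name `smear_spectrogram` and the parameter `width_bins` are hypothetical names chosen for this illustration.

```python
import numpy as np


def smear_spectrogram(power_spec, width_bins=2.0):
    """Blur a (frames, freq_bins) power spectrogram along the frequency axis.

    Assumption: a Gaussian blur stands in for the broadened auditory
    filters of a real hearing model. `width_bins` is the Gaussian
    standard deviation measured in frequency bins.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    if width_bins <= 0:
        return power_spec.copy()
    n_bins = power_spec.shape[1]
    idx = np.arange(n_bins)
    # Row i holds the weights with which output bin i collects energy
    # from every input bin j.
    kernel = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / width_bins) ** 2)
    # Normalise each row so that a flat spectrum stays flat.
    kernel /= kernel.sum(axis=1, keepdims=True)
    return power_spec @ kernel.T


# Usage: smear a random stand-in for a 5-frame, 64-bin spectrogram.
rng = np.random.default_rng(0)
spec = rng.random((5, 64))
smeared = smear_spectrogram(spec, width_bins=3.0)
```

Under these assumptions, a larger `width_bins` removes more fine spectral detail, which loosely mimics reduced frequency selectivity. An augmentation pipeline could draw `width_bins` at random per utterance, although how the paper parameterises its smearing is not stated in this abstract.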

research
02/25/2021

MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

In this paper, we propose MixSpeech, a simple yet effective data augment...
research
04/18/2019

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

We present SpecAugment, a simple data augmentation method for speech rec...
research
12/19/2017

Improved Regularization Techniques for End-to-End Speech Recognition

Regularization is important for end-to-end speech models, since the mode...
research
12/22/2019

End-to-end Training of a Large Vocabulary End-to-end Speech Recognition System

In this paper, we present an end-to-end training framework for building ...
research
12/11/2019

SpecAugment on Large Scale Datasets

Recently, SpecAugment, an augmentation scheme for automatic speech recog...
research
01/16/2023

BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Recent developments using End-to-End Deep Learning models have been show...
