Dual Application of Speech Enhancement for Automatic Speech Recognition

11/07/2020
by Ashutosh Pandey, et al.

In this work, we exploit speech enhancement to improve a recurrent neural network transducer (RNN-T) based automatic speech recognition (ASR) system. We employ a dense convolutional recurrent network (DCRN) for speech enhancement based on complex spectral mapping, and find it helpful for ASR in two ways: as a data augmentation technique and as a preprocessing frontend. When using it for ASR data augmentation, we exploit a KL-divergence-based consistency loss computed between the ASR outputs of the original and enhanced utterances. To make speech enhancement an effective ASR frontend, we propose a three-step training scheme based on model pretraining and feature selection. We evaluate our proposed techniques on a challenging social media English video dataset, and achieve an average relative improvement of 11.2 enhancement based preprocessing, and 13.4
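The consistency loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that a KL divergence is computed between the ASR outputs of the original and enhanced utterances. The direction of the divergence, the averaging over output positions, and the function names below are assumptions made for illustration, not the authors' implementation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(logits_p, logits_q):
    """KL(p || q) for two token distributions given as logits."""
    log_p = log_softmax(logits_p)
    log_q = log_softmax(logits_q)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def consistency_loss(orig_logits, enh_logits):
    """Mean KL over output positions (hypothetical helper).

    orig_logits: ASR output logits for the original utterance,
                 one list of vocabulary logits per output position.
    enh_logits:  logits for the enhanced version of the same utterance.
    Assumption: KL(original || enhanced), averaged over positions.
    """
    per_position = [kl_divergence(p, q) for p, q in zip(orig_logits, enh_logits)]
    return sum(per_position) / len(per_position)

# Toy example: two output positions, vocabulary of three tokens.
orig = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
enh = [[1.8, 0.7, -0.9], [0.0, 0.4, 0.2]]
loss = consistency_loss(orig, enh)
```

In training, a term like this would typically be added to the main ASR loss with a weighting factor, so that the model is encouraged to produce similar output distributions for an utterance and its enhanced version. The weight and how the two terms are combined are not given in the abstract.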


research
10/24/2022

Time-Domain Speech Enhancement for Robust Automatic Speech Recognition

It has been shown that the intelligibility of noisy speech can be improv...
research
03/11/2019

Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling

Monaural speech enhancement has made dramatic advances since the introdu...
research
02/11/2021

An Investigation of End-to-End Models for Robust Speech Recognition

End-to-end models for robust automatic speech recognition (ASR) have not...
research
09/15/2022

MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement

Speech enhancement improves speech quality and promotes the performance ...
research
07/22/2021

Multitask-Based Joint Learning Approach To Robust ASR For Radio Communication Speech

To realize robust end-to-end Automatic Speech Recognition(E2E ASR) under...
research
07/28/2020

Neural Kalman Filtering for Speech Enhancement

Statistical signal processing based speech enhancement methods adopt exp...
research
03/26/2018

Spectral feature mapping with mimic loss for robust speech recognition

For the task of speech enhancement, local learning objectives are agnost...
