Time-Domain Speech Enhancement for Robust Automatic Speech Recognition

10/24/2022
by   Yufeng Yang, et al.
0

It has been shown that the intelligibility of noisy speech can be improved by speech enhancement algorithms. However, speech enhancement has not been established as an effective front-end for robust automatic speech recognition (ASR) in comparison with an ASR model trained on noisy speech directly. The divide between speech enhancement and ASR impedes the progress of robust ASR systems especially as speech enhancement has made big strides in recent years. In this work, we focus on eliminating such divide with an ARN (attentive recurrent network) based time-domain enhancement model. The proposed system fully decouples speech enhancement and an acoustic model trained only on clean speech. Results on the CHiME-2 corpus show that ARN enhanced speech translates to improved ASR results. The proposed system achieves 6.28% average word error rate, outperforming the previous best by 19.3%.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/11/2019

Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling

Monaural speech enhancement has made dramatic advances since the introdu...
research
10/29/2019

Does Speech enhancement of publicly available data help build robust Speech Recognition Systems?

Automatic speech recognition (ASR) systems play a key role in many comme...
research
08/22/2023

Convoifilter: A case study of doing cocktail party speech recognition

This paper presents an end-to-end model designed to improve automatic sp...
research
11/16/2021

Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses

Speech enhancement has recently achieved great success with various deep...
research
11/07/2020

Dual Application of Speech Enhancement for Automatic Speech Recognition

In this work, we exploit speech enhancement for improving a recurrent ne...
research
02/11/2021

An Investigation of End-to-End Models for Robust Speech Recognition

End-to-end models for robust automatic speech recognition (ASR) have not...
research
06/18/2019

Deep Xi as a Front-End for Robust Automatic Speech Recognition

Front-end techniques for robust automatic speech recognition (ASR) have ...

Please sign up or login with your details

Forgot password? Click here to reset