Correlation Distance Skip Connection Denoising Autoencoder (CDSK-DAE) for Speech Feature Enhancement

07/26/2019
by Alzahra Badi, et al.

The performance of learning-based automatic speech recognition (ASR) is susceptible to noise, especially when the noise appears in the test data but not in the training data. This work focuses on feature enhancement for a noise-robust end-to-end ASR system and introduces a novel variant of the denoising autoencoder (DAE). The proposed method uses skip connections on both the encoder and decoder sides, passing speech information of the target frame from the input into the model. It also trains the model with a new objective function whose penalty terms use a correlation distance measure to quantify the dependency between the latent target features and the model outputs (the latent features and the enhanced features obtained from the DAE). The proposed method was compared against a conventional model and a state-of-the-art model under both seen and unseen noisy environments, covering 7 types of background noise at different SNR levels (0, 5, 10 and 20 dB). It was also tested with both linear and non-linear penalty terms; both variants improve the overall average WER under seen and unseen noisy conditions compared with the state-of-the-art model.
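To make the idea of a correlation distance penalty concrete, here is a minimal NumPy sketch. It assumes the common definition of correlation distance as one minus the Pearson correlation, and it assumes the penalty is simply added to a mean-squared reconstruction error with a weight `lam`. The function names, the pairing of terms, and the weighting are hypothetical illustrations, not the formulation used in the paper.

```python
import numpy as np


def correlation_distance(a, b, eps=1e-12):
    """Return 1 - Pearson correlation of two flattened arrays (range about [0, 2])."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    # eps guards against division by zero for constant inputs.
    denom = np.linalg.norm(a) * np.linalg.norm(b) + eps
    return 1.0 - float(a @ b) / denom


def penalized_loss(enhanced, clean, latent, latent_target, lam=0.1):
    """Reconstruction MSE plus a weighted correlation distance penalty.

    Assumption: the penalty compares the model's latent features with the
    latent target features; the paper may combine its terms differently.
    """
    enhanced = np.asarray(enhanced, dtype=float)
    clean = np.asarray(clean, dtype=float)
    mse = float(np.mean((enhanced - clean) ** 2))
    return mse + lam * correlation_distance(latent, latent_target)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(4, 8))            # toy clean feature frames
    enhanced = clean + 0.1 * rng.normal(size=clean.shape)
    latent_target = rng.normal(size=(4, 3))    # toy latent target features
    latent = latent_target + 0.1 * rng.normal(size=latent_target.shape)
    print(penalized_loss(enhanced, clean, latent, latent_target))
```

Under this definition the penalty is near 0 when the two feature sets are perfectly positively correlated and near 2 when they are perfectly anti-correlated, so minimizing it pushes the model's features toward linear dependence on the targets. A non-linear dependency measure would need a different distance function in place of `correlation_distance`.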


research
09/05/2023

Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition

In recent research, in the domain of speech processing, large End-to-End...
research
11/09/2020

Gated Recurrent Fusion with Joint Training Framework for Robust End-to-End Speech Recognition

The joint training framework for speech enhancement and recognition meth...
research
05/29/2023

speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition

In recent years, the joint training of speech enhancement front-end and ...
research
04/25/2022

Cleanformer: A microphone array configuration-invariant, streaming, multichannel neural enhancement frontend for ASR

This work introduces the Cleanformer, a streaming multichannel neural ba...
research
06/22/2016

A Curriculum Learning Method for Improved Noise Robustness in Automatic Speech Recognition

The performance of automatic speech recognition systems under noisy envi...
research
10/05/2021

Late reverberation suppression using U-nets

In real-world settings, speech signals are almost always affected by rev...
research
01/20/2021

Noise Learning Based Denoising Autoencoder

This letter introduces a new denoiser that modifies the structure of den...
