On Cross-Corpus Generalization of Deep Learning Based Speech Enhancement

02/10/2020
by   Ashutosh Pandey, et al.

In recent years, supervised approaches using deep neural networks (DNNs) have become mainstream in speech enhancement. It has been established that DNNs generalize well to untrained noises and speakers when trained on a large number of noises and speakers. However, we find that DNNs fail to generalize to new speech corpora in low signal-to-noise ratio (SNR) conditions. In this work, we establish that this lack of generalization is mainly due to channel mismatch between the training corpus and untrained corpora. Additionally, we observe that traditional channel normalization techniques are not effective in improving cross-corpus generalization. Further, we evaluate publicly available datasets that are promising for generalization and find one particular corpus to be significantly better than the others. Finally, we find that using a smaller frame shift in short-time processing of speech can significantly improve cross-corpus generalization. The proposed techniques to address cross-corpus generalization include channel normalization, a better training corpus, and a smaller frame shift in the short-time Fourier transform (STFT). Together, these techniques significantly improve objective intelligibility and quality scores on untrained corpora.
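To make the frame-shift idea concrete, the following is a minimal sketch of short-time processing with a configurable frame shift. It is not the authors' implementation: the function names `frame_signal` and `stft_magnitude`, the 16 kHz sampling rate, the 20 ms frame length, the Hamming window, and the three shift values are all illustrative assumptions, and random noise stands in for a speech signal. It only shows that, for a fixed frame length, a smaller shift yields more, and more heavily overlapping, frames per second of audio.

```python
import numpy as np

def frame_signal(x, frame_len, frame_shift):
    """Split a 1-D signal into overlapping frames (no padding; hypothetical helper)."""
    n_frames = 1 + (len(x) - frame_len) // frame_shift
    # Row i holds samples [i * frame_shift, i * frame_shift + frame_len).
    idx = np.arange(frame_len)[None, :] + frame_shift * np.arange(n_frames)[:, None]
    return x[idx]

def stft_magnitude(x, frame_len, frame_shift):
    """Magnitude STFT with a Hamming window (window choice is an assumption)."""
    frames = frame_signal(x, frame_len, frame_shift) * np.hamming(frame_len)
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 16000                                         # assumed sampling rate in Hz
x = np.random.default_rng(0).standard_normal(sr)   # 1 s of noise as a placeholder signal
frame_len = 320                                    # 20 ms at 16 kHz (illustrative value)

for frame_shift in (160, 80, 32):                  # 10 ms, 5 ms, 2 ms (illustrative values)
    spec = stft_magnitude(x, frame_len, frame_shift)
    overlap = 1.0 - frame_shift / frame_len
    print(f"shift={frame_shift:3d} samples, overlap={overlap:.0%}, "
          f"frames x bins = {spec.shape}")
```

The number of frequency bins depends only on the frame length, so reducing the shift changes the temporal resolution of the input representation, and the amount of computation per second of audio, without changing its spectral resolution.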


research
05/26/2021

Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization

Deep neural networks (DNNs) represent the mainstream methodology for sup...
research
04/07/2020

SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement

This paper analyzes the generalization of speech enhancement algorithms ...
research
09/07/2017

Normalized Features for Improving the Generalization of DNN Based Speech Enhancement

Enhancing noisy speech is an important task to restore its quality and t...
research
09/07/2017

Improving the Generalizability of Deep Neural Network Based Speech Enhancement

Enhancing noisy speech is an important task to restore its quality and t...
research
05/29/2019

Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect

When speaking in presence of background noise, humans reflexively change...
research
05/10/2021

Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages

In this paper, we conduct one of the very first studies for cross-corpor...
research
06/01/2018

DNN Based Speech Enhancement for Unseen Noises Using Monte Carlo Dropout

In this work, we propose the use of dropouts as a Bayesian estimator for...
