Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization

05/26/2021
by Ashutosh Pandey, et al.

Deep neural networks (DNNs) represent the mainstream methodology for supervised speech enhancement, primarily due to their capability to model complex functions using hierarchical representations. However, a recent study revealed that DNNs trained on a single corpus fail to generalize to untrained corpora, especially in low signal-to-noise ratio (SNR) conditions. Developing a noise-, speaker-, and corpus-independent speech enhancement algorithm is essential for real-world applications. In this study, we propose a self-attending recurrent neural network (SARNN) for time-domain speech enhancement to improve cross-corpus generalization. SARNN comprises recurrent neural networks (RNNs) augmented with self-attention blocks and feedforward blocks. We evaluate SARNN on different corpora with nonstationary noises in low SNR conditions. Experimental results demonstrate that SARNN substantially outperforms competitive approaches to time-domain speech enhancement, such as RNNs and dual-path SARNNs. Additionally, we report an important finding: the two popular approaches to speech enhancement, complex spectral mapping and time-domain enhancement, obtain similar results for RNN and SARNN with large-scale training. We also provide a challenging subset of the test set used in this study for evaluating future algorithms and facilitating direct comparisons.
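
To make the architectural idea in the abstract concrete, the following is a minimal NumPy sketch of one recurrent layer followed by a self-attention block and a feedforward block. It is not the authors' implementation: the abstract does not specify the RNN type, the attention variant, the normalization, or the use of residual connections, so the plain tanh RNN, single-head scaled dot-product attention, layer normalization, and residual sums below are all assumptions, and every function and parameter name (`toy_block`, `init_params`, `w_q`, and so on) is hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each frame to zero mean and unit variance (no learned gain)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def rnn_layer(x, w_in, w_rec, b):
    """Plain tanh RNN over frames (assumed cell type); x is (T, D), returns (T, D)."""
    h = np.zeros(w_rec.shape[0])
    outputs = []
    for frame in x:
        h = np.tanh(frame @ w_in + h @ w_rec + b)
        outputs.append(h)
    return np.stack(outputs)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention across all frames."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return weights @ v, weights


def feedforward(x, w1, b1, w2, b2):
    """Two-layer position-wise feedforward block with ReLU."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def toy_block(x, p):
    """RNN -> self-attention -> feedforward, with assumed residual connections."""
    h = rnn_layer(x, p["w_in"], p["w_rec"], p["b"])
    attended, weights = self_attention(layer_norm(h), p["w_q"], p["w_k"], p["w_v"])
    h = h + attended
    h = h + feedforward(layer_norm(h), p["w1"], p["b1"], p["w2"], p["b2"])
    return h, weights


def init_params(dim, hidden, seed=0):
    """Random weights for illustration only; a real model would learn these."""
    rng = np.random.default_rng(seed)

    def mat(rows, cols):
        return rng.standard_normal((rows, cols)) / np.sqrt(rows)

    return {
        "w_in": mat(dim, dim), "w_rec": mat(dim, dim), "b": np.zeros(dim),
        "w_q": mat(dim, dim), "w_k": mat(dim, dim), "w_v": mat(dim, dim),
        "w1": mat(dim, hidden), "b1": np.zeros(hidden),
        "w2": mat(hidden, dim), "b2": np.zeros(dim),
    }


# Ten frames of 16-dimensional features standing in for framed audio.
frames = np.random.default_rng(1).standard_normal((10, 16))
params = init_params(dim=16, hidden=32)
out, attn = toy_block(frames, params)
print(out.shape, attn.shape)
```

In this sketch the recurrent layer summarizes the past frame by frame, while the attention weights `attn` let every frame draw on every other frame in the utterance. A system intended for causal processing would need to mask future frames, which is omitted here.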

research
02/10/2020

On Cross-Corpus Generalization of Deep Learning Based Speech Enhancement

In recent years, supervised approaches using deep neural networks (DNNs)...
research
01/14/2020

Robust Speaker Recognition Using Speech Enhancement And Attention Model

In this paper, a novel architecture for speaker recognition is proposed ...
research
04/07/2020

SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement

This paper analyzes the generalization of speech enhancement algorithms ...
research
09/12/2023

Assessing the Generalization Gap of Learning-Based Speech Enhancement Systems in Noisy and Reverberant Environments

The acoustic variability of noisy and reverberant speech mixtures is inf...
research
05/20/2020

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids

Modern speech enhancement algorithms achieve remarkable noise suppressio...
research
10/20/2021

TPARN: Triple-path Attentive Recurrent Network for Time-domain Multichannel Speech Enhancement

In this work, we propose a new model called triple-path attentive recurr...
research
09/07/2017

Improving the Generalizability of Deep Neural Network Based Speech Enhancement

Enhancing noisy speech is an important task to restore its quality and t...
