Monaural Speech Enhancement with Recursive Learning in the Time Domain

03/22/2020
by Andong Li, et al.

In this paper, we propose RTNet, a neural network with recursive learning in the time domain for monaural speech enhancement. The network consists of three principal components. The first is a stage recurrent neural network, which uses a memory mechanism to aggregate deep feature dependencies across stages and to remove the interference stage by stage. The second is a convolutional auto-encoder. The third is a series of concatenated gated linear units, which facilitate information flow and gradually increase the receptive field. Recursive learning is adopted to improve parameter efficiency, so the number of trainable parameters is reduced without sacrificing performance. Experiments are conducted on the TIMIT corpus. The results show that the proposed network consistently achieves better PESQ and STOI scores than two advanced time-domain baselines under different conditions. The code is provided at https://github.com/Andong-Li-speech/RTNet.
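Two ideas in the abstract can be illustrated with a minimal sketch: a gated linear unit built on dilated 1-D convolutions, and recursive learning, in which the same parameters are reused at every stage. This is not the authors' implementation (that lives in the repository linked above). The function names, kernel values, dilation rates (1, 2, 4), and stage count are all assumptions chosen for illustration, and the stage recurrent network and the convolutional auto-encoder are omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dilated_conv1d(x, kernel, dilation):
    """Same-length 1-D convolution with zero padding (illustrative only)."""
    pad_left = (len(kernel) - 1) * dilation // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - pad_left + j * dilation
            if 0 <= idx < len(x):  # samples outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def glu(x, lin_kernel, gate_kernel, dilation):
    """Gated linear unit: a linear branch scaled element-wise by a sigmoid gate."""
    lin = dilated_conv1d(x, lin_kernel, dilation)
    gate = dilated_conv1d(x, gate_kernel, dilation)
    return [a * sigmoid(g) for a, g in zip(lin, gate)]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def count_params(layers):
    return sum(len(layer["lin"]) + len(layer["gate"]) for layer in layers)


def recursive_enhance(noisy, layers, num_stages):
    """Refine the estimate over several stages using the SAME layers each time."""
    estimate = list(noisy)
    for _ in range(num_stages):
        for layer in layers:  # shared parameters: no new weights per stage
            estimate = glu(estimate, layer["lin"], layer["gate"], layer["dilation"])
    return estimate


# Hypothetical configuration: kernel values and dilation rates are arbitrary.
LAYERS = [
    {"lin": [0.2, 0.6, 0.2], "gate": [0.0, 1.0, 0.0], "dilation": d}
    for d in (1, 2, 4)
]

if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    print(recursive_enhance(signal, LAYERS, num_stages=3))
    print(receptive_field(3, [1, 2, 4]))  # 15 samples for one pass
    print(count_params(LAYERS))           # 18, independent of num_stages
```

In this sketch, stacking dilations 1, 2, and 4 widens the receptive field of one pass to 15 samples, while running three stages with shared layers keeps the parameter count at 18 rather than the 54 that three independent copies would need.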

