DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement

08/01/2020
by Yanxin Hu, et al.

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF masks or the speech spectrum via a plain convolutional neural network (CNN) or recurrent neural network (RNN). Some recent studies use the complex-valued spectrogram as a training target but train a real-valued network, predicting either the magnitude and phase components or the real and imaginary parts. In particular, the convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure with long short-term memory (LSTM), which has proven helpful for complex targets. To train on complex targets more effectively, in this paper we design a new network structure that simulates complex-valued operations, called the Deep Complex Convolution Recurrent Network (DCCRN), in which both the CNN and RNN structures can handle complex-valued operations. The proposed DCCRN models are very competitive with previous networks on both objective and subjective metrics. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first in the real-time track and second in the non-real-time track in terms of Mean Opinion Score (MOS).
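The abstract does not spell out how a complex-valued operation is simulated. One common way, sketched below, is to keep separate real and imaginary tensors and combine four real-valued convolutions according to the rule (a + ib)(c + id) = (ac - bd) + i(ad + bc). This is an illustrative assumption rather than the authors' implementation: the function name `complex_conv1d` is hypothetical, and a 1-D NumPy convolution stands in for the 2-D convolutions a spectrogram model would use.

```python
import numpy as np


def complex_conv1d(x_real, x_imag, w_real, w_imag):
    """Complex 1-D convolution built from four real-valued convolutions.

    Hypothetical helper for illustration only: it applies
    (a + ib) * (c + id) = (ac - bd) + i(ad + bc) to convolution.
    """
    # Real part: real*real minus imag*imag.
    out_real = (np.convolve(x_real, w_real, mode="valid")
                - np.convolve(x_imag, w_imag, mode="valid"))
    # Imaginary part: real*imag plus imag*real.
    out_imag = (np.convolve(x_real, w_imag, mode="valid")
                + np.convolve(x_imag, w_real, mode="valid"))
    return out_real, out_imag


# Compare against NumPy's native complex convolution on random data.
rng = np.random.default_rng(0)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

out_real, out_imag = complex_conv1d(x.real, x.imag, w.real, w.imag)
reference = np.convolve(x, w, mode="valid")
matches = np.allclose(out_real + 1j * out_imag, reference)
```

Splitting the computation this way lets the real and imaginary kernels be stored as ordinary real-valued weights while the output still follows complex arithmetic, which is the property a phase-aware model relies on.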


research
04/12/2021

Complex Spectral Mapping With Attention Based Convolution Recurrent Neural Network for Speech Enhancement

Speech enhancement has benefited from the success of deep learning in te...
research
02/09/2021

Real-time Monaural Speech Enhancement With Short-time Discrete Cosine Transform

Speech enhancement algorithms based on deep learning have been improved ...
research
10/27/2020

Phase Aware Speech Enhancement using Realisation of Complex-valued LSTM

Most of the deep learning based speech enhancement (SE) methods rely on ...
research
01/11/2023

Rethinking complex-valued deep neural networks for monaural speech enhancement

Despite multiple efforts made towards adopting complex-valued deep neura...
research
07/12/2021

DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement

The dual-path RNN (DPRNN) was proposed to more effectively model extreme...
research
11/20/2018

Differentiable Consistency Constraints for Improved Deep Speech Enhancement

In recent years, deep networks have led to dramatic improvements in spee...
research
06/24/2021

A Simultaneous Denoising and Dereverberation Framework with Target Decoupling

Background noise and room reverberation are regarded as two major factor...
