DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement

06/16/2021
by Shubo Lv, et al.

Deep complex convolution recurrent network (DCCRN), which extends CRN with complex structure, achieved superior performance in the MOS evaluation of the Interspeech 2020 deep noise suppression challenge (DNS2020). This paper further extends DCCRN with the following significant revisions. We first extend the model to sub-band processing, where the bands are split and merged by learnable neural network filters instead of engineered FIR filters, leading to a faster noise suppressor trained in an end-to-end manner. The LSTM is then replaced with a complex TF-LSTM to better model temporal dependencies along both the time and frequency axes. Moreover, instead of simply concatenating the output of each encoder layer to the input of the corresponding decoder layer, we use convolution blocks to first aggregate essential information from the encoder output before feeding it to the decoder layers. We specifically formulate the decoder with an extra a priori SNR estimation module to maintain good speech quality while removing noise. Finally, a post-processing module is adopted to further suppress the unnatural residual noise. The new model, named DCCRN+, surpasses the original DCCRN as well as several competitive models in terms of PESQ and DNSMOS, and has achieved superior performance in the new Interspeech 2021 DNS challenge.
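
The abstract does not say how the target for the a priori SNR estimation module is defined. As a rough illustration only, the sketch below assumes the textbook quantity, the ratio of clean-speech power to noise power per time-frequency bin, computed from oracle clean and noise spectra and expressed in dB. The function name `a_priori_snr_db`, the `eps` floor, and the sample values are hypothetical and are not taken from the paper.

```python
import math


def a_priori_snr_db(clean, noise, eps=1e-12):
    """Per-bin SNR in dB, 10*log10(|S|^2 / |N|^2), from oracle spectra.

    Assumption: instantaneous bin powers stand in for the expected
    powers used in the classical a priori SNR definition.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same number of bins")
    return [
        # eps keeps the ratio finite when a bin has zero power
        10.0 * math.log10((abs(s) ** 2 + eps) / (abs(n) ** 2 + eps))
        for s, n in zip(clean, noise)
    ]


# Hypothetical complex STFT bins of a single frame, for illustration only.
clean_frame = [3 + 4j, 1 + 0j, 0.1 + 0j]
noise_frame = [0.5 + 0j, 1 + 0j, 1 + 0j]
snr_db = a_priori_snr_db(clean_frame, noise_frame)
```

With these sample bins the first is speech-dominated (about 20 dB), the second is balanced (about 0 dB), and the third is noise-dominated (about -20 dB). A network trained to predict a quantity like this would presumably have a cue for how aggressively each region can be suppressed, though the paper's actual formulation may differ.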

research
11/16/2021

S-DCCRN: Super Wide Band DCCRN with learnable complex feature for speech enhancement

In speech enhancement, complex neural network has shown promising perfor...
research
03/23/2022

FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement

Previously proposed FullSubNet has achieved outstanding performance in D...
research
03/04/2022

PercepNet+: A Phase and SNR Aware PercepNet for Real-Time Speech Enhancement

PercepNet, a recent extension of the RNNoise, an efficient, high-quality...
research
05/03/2022

Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention

Hand-crafted spatial features, such as inter-channel intensity differenc...
research
06/15/2022

FRCRN: Boosting Feature Representation using Frequency Recurrence for Monaural Speech Enhancement

Convolutional recurrent networks (CRN) integrating a convolutional encod...
research
06/18/2020

Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing

The Transformer architecture has shown its superior ability than recurre...
research
07/26/2023

Exploring the Interactions between Target Positive and Negative Information for Acoustic Echo Cancellation

Acoustic echo cancellation (AEC) aims to remove interference signals whi...
