DESNet: A Multi-channel Network for Simultaneous Speech Dereverberation, Enhancement and Separation

11/04/2020
by Yihui Fu, et al.

In this paper, we propose a multi-channel network for simultaneous speech dereverberation, enhancement and separation (DESNet). To enable gradient propagation and joint optimization, we adopt the attentional selection mechanism over multi-channel features, which was originally proposed in the end-to-end unmixing, fixed-beamforming and extraction (E2E-UFE) structure. Furthermore, the novel deep complex convolutional recurrent network (DCCRN) is used as the speech unmixing structure, and a neural-network-based weighted prediction error (WPE) stage is cascaded in front of it for speech dereverberation. We also introduce a staged SNR strategy and a symphonic loss for training the network to further improve the final performance. Experiments show that in the non-dereverberated case, the proposed DESNet outperforms DCCRN and most state-of-the-art structures in speech enhancement and separation, while in the dereverberated scenario, DESNet also shows improvements over the cascaded WPE-DCCRN networks.
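
The abstract motivates attentional selection by the need for gradient propagation: a hard pick of one beam (an argmax) is not differentiable, whereas a softmax-weighted sum is. The sketch below illustrates only that general idea and is not the DESNet or E2E-UFE implementation. The scaled dot-product scoring, the function names `softmax` and `attentional_select`, and the toy beam features are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentional_select(beam_features, query):
    """Soft-select among candidate beam features.

    beam_features: list of B feature vectors, one per (hypothetical) fixed beam.
    query: feature vector used to score each beam (assumed scoring scheme).
    Returns (weights, selected), where `selected` is the weighted sum of beams.
    """
    dim = len(query)
    # Assumed scoring: scaled dot product between each beam feature and the query.
    scores = [
        sum(f * q for f, q in zip(feat, query)) / math.sqrt(dim)
        for feat in beam_features
    ]
    weights = softmax(scores)
    # The weighted sum keeps every beam in the computation, so in an autodiff
    # framework gradients could reach all beams, unlike a hard argmax pick.
    selected = [
        sum(w * feat[d] for w, feat in zip(weights, beam_features))
        for d in range(dim)
    ]
    return weights, selected


# Toy example: three 2-dimensional beam features and one query.
beams = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.2]
weights, selected = attentional_select(beams, query)
print(weights, selected)
```

In a real system the same weighted sum would be written in an automatic-differentiation framework and applied per frame to complex-valued beamformer outputs; the pure-Python version above only shows the selection arithmetic.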

research
07/06/2023

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Accurate recognition of cocktail party speech containing overlapping spe...
research
11/08/2021

Inter-channel Conv-TasNet for multichannel speech enhancement

Speech enhancement in multichannel settings has been realized by utilizi...
research
10/23/2020

Speech enhancement aided end-to-end multi-task learning for voice activity detection

Robust voice activity detection (VAD) is a challenging task in low signa...
research
10/17/2022

Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement

Recently, multi-channel speech enhancement has drawn much interest due t...
research
05/03/2022

Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention

Hand-crafted spatial features, such as inter-channel intensity differenc...
research
05/08/2020

Neural Spatio-Temporal Beamformer for Target Speech Separation

Purely neural network (NN) based speech separation and enhancement metho...
research
11/03/2019

Onssen: an open-source speech separation and enhancement library

Speech separation is an essential task for multi-talker speech recogniti...
