Convolutive Prediction for Reverberant Speech Separation

08/16/2021
by   Zhong-Qiu Wang, et al.

We investigate the effectiveness of convolutive prediction, a novel formulation of linear prediction for speech dereverberation, for speaker separation in reverberant conditions. The key idea is to first use a deep neural network (DNN) to estimate the direct-path signal of each speaker, and then identify delayed and decayed copies of the estimated direct-path signal. Such copies are likely due to reverberation, and can be directly removed for dereverberation or used as extra features for another DNN to perform better dereverberation and separation. To identify such copies, we solve a linear regression problem per frequency efficiently in the time-frequency (T-F) domain to estimate the underlying room impulse response (RIR). In the multi-channel extension, we perform minimum variance distortionless response (MVDR) beamforming on the outputs of convolutive prediction. The beamforming and dereverberation results are used as extra features for a second DNN to perform better separation and dereverberation. State-of-the-art results are obtained on the SMS-WSJ corpus.
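The per-frequency regression described in the abstract can be illustrated with a minimal sketch. It assumes a single channel, an unweighted least-squares objective with a small ridge term, and that the DNN's direct-path estimate is already available as an STFT array. The function name `convolutive_prediction` and the parameters `taps`, `delay`, and `eps` are hypothetical names chosen for illustration; the paper's actual formulation may differ in weighting and filter design.

```python
import numpy as np


def convolutive_prediction(Y, S_hat, taps=4, delay=1, eps=1e-8):
    """Estimate and remove reverberation, one frequency at a time.

    Y     : complex STFT of the reverberant signal, shape (T, F)
    S_hat : complex STFT of the estimated direct-path signal, shape (T, F)
    taps  : number of filter taps per frequency (assumed value)
    delay : first frame lag treated as reverberation (assumed value)
    eps   : ridge term that keeps the normal equations well conditioned
    """
    T, F = Y.shape
    if T <= delay + taps:
        raise ValueError("need more frames than delay + taps")

    reverb = np.zeros_like(Y)
    for f in range(F):
        # Column k holds the direct-path estimate delayed by (delay + k) frames.
        A = np.zeros((T, taps), dtype=Y.dtype)
        for k in range(taps):
            shift = delay + k
            A[shift:, k] = S_hat[:T - shift, f]

        # Whatever is left after removing the direct path is modelled as a
        # sum of delayed, scaled copies of the direct-path estimate.
        target = Y[:, f] - S_hat[:, f]

        # Solve the regularized normal equations for the filter g(f).
        gram = A.conj().T @ A + eps * np.eye(taps)
        g = np.linalg.solve(gram, A.conj().T @ target)

        reverb[:, f] = A @ g

    # Subtracting the estimated reverberation gives a dereverberated signal;
    # per the abstract, both outputs could also serve as extra DNN features.
    return Y - reverb, reverb
```

In this sketch the filter is solved independently for each frequency bin, so the cost is dominated by F small `taps`-by-`taps` linear systems rather than one large time-domain problem.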

research
08/16/2021

Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation

A promising approach for speech dereverberation is based on supervised l...
research
11/22/2022

TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation

We propose TF-GridNet for speech separation. The model is a novel multi-...
research
05/23/2020

Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References

Acoustic beamformers have been widely used to enhance audio signals. Cur...
research
09/08/2022

TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation

We propose TF-GridNet, a novel multi-path deep neural network (DNN) oper...
research
04/19/2022

Single-Channel Speech Dereverberation using Subband Network with A Reverberation Time Shortening Target

This work proposes a subband network for single-channel speech dereverbe...
research
10/04/2020

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speaker Separation

We propose multi-microphone complex spectral mapping, a simple way of ap...
research
10/01/2021

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement

A promising approach for multi-microphone speech separation involves two...
