Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement

11/18/2019
by   Zhong-Qiu Wang, et al.

This work investigates alternation between spectral separation using masking-based networks and spatial separation using multichannel beamforming. In this framework, the spectral separation is performed using a mask-based deep network. The result of mask-based separation is used, in turn, to estimate a spatial beamformer. The output of the beamformer is fed back into another mask-based separation network. We explore multiple ways of computing time-varying covariance matrices to improve beamforming, including factorizing the spatial covariance into a time-varying amplitude component and time-invariant spatial component. For the subsequent mask-based filtering, we consider different modes, including masking the noisy input, masking the beamformer output, and a hybrid approach combining both. Our best method first uses spectral separation, then spatial beamforming, and finally a spectral post-filter, and demonstrates an average improvement of 2.8 dB over baseline mask-based separation, across four different reverberant speech enhancement and separation tasks.
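The spectral-then-spatial step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name a specific beamformer, so the MVDR formulation below is an assumption, and the function names, array layouts (`F` frequencies, `T` frames, `M` microphones), and `eps` regularizer are all hypothetical. The masks would come from a separation network; here they are simply treated as given arrays.

```python
import numpy as np

def masked_covariance(Y, mask, eps=1e-8):
    """Time-invariant, mask-weighted spatial covariance.
    Y: (F, T, M) complex STFT, mask: (F, T) in [0, 1]. Returns (F, M, M)."""
    weighted = mask[..., None] * Y
    cov = np.einsum("ftm,ftn->fmn", weighted, Y.conj())
    return cov / (mask.sum(axis=1)[:, None, None] + eps)

def factorized_covariance(amplitude, spatial):
    """Time-varying covariance as amplitude(f, t) * spatial(f).
    amplitude: (F, T) non-negative, spatial: (F, M, M). Returns (F, T, M, M).
    How the amplitude is estimated is left open here (assumption)."""
    return amplitude[..., None, None] * spatial[:, None]

def mvdr_weights(cov_s, cov_n, ref=0, eps=1e-8):
    """MVDR weights w = (Phi_n^-1 Phi_s) u_ref / trace(Phi_n^-1 Phi_s).
    cov_s, cov_n: (F, M, M). Returns (F, M)."""
    M = cov_s.shape[-1]
    cov_n = cov_n + eps * np.eye(M)          # diagonal loading for stability
    num = np.linalg.solve(cov_n, cov_s)      # Phi_n^-1 Phi_s, per frequency
    tr = np.trace(num, axis1=-2, axis2=-1)[:, None]
    return num[..., ref] / (tr + eps)

def beamform(Y, w):
    """Apply w^H y per time-frequency bin. Y: (F, T, M), w: (F, M)."""
    return np.einsum("fm,ftm->ft", w.conj(), Y)
```

In a pipeline like the one the abstract describes, `mask` would be the output of the first separation network, `1 - mask` could serve as a crude interference mask, and the output of `beamform` would be passed to the second mask-based network as the post-filter input.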

research
07/11/2019

Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming

In this paper, we propose two mask-based beamforming methods using a dee...
research
12/08/2021

NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers

Estimating a time-varying spatial covariance matrix for a beamforming al...
research
11/17/2020

Rethinking the Separation Layers in Speech Separation Networks

Modules in all existing speech separation networks can be categorized in...
research
03/04/2022

Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement

Speech enhancement in the time-frequency domain is often performed by es...
research
05/07/2022

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

Beamforming is a powerful tool designed to enhance speech signals from t...
research
05/08/2019

Universal Sound Separation

Recent deep learning approaches have achieved impressive performance on ...
research
10/04/2020

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speaker Separation

We propose multi-microphone complex spectral mapping, a simple way of ap...
