Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming

07/11/2019
by   Yoshiki Masuyama, et al.

In this paper, we propose two mask-based beamforming methods using a deep neural network (DNN) trained by multichannel loss functions. Beamforming techniques using time-frequency (TF) masks estimated by a DNN have been applied to many applications, where the TF masks are used to estimate spatial covariance matrices. To train a DNN for mask-based beamforming, loss functions designed for monaural speech enhancement/separation have been employed. Although such a training criterion is simple, it does not directly correspond to the performance of mask-based beamforming. To overcome this problem, we use multichannel loss functions that evaluate the estimated spatial covariance matrices based on the multichannel Itakura–Saito divergence. DNNs trained with the multichannel loss functions can be used to construct several beamformers. Experimental results confirmed their effectiveness and robustness to microphone configurations.
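
The two ingredients named in the abstract, mask-weighted spatial covariance estimation and a multichannel Itakura–Saito divergence between covariance matrices, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' loss: the function names, array shapes, and the regularizer `eps` are hypothetical, and the divergence uses the commonly cited form tr(R R̂⁻¹) − log det(R R̂⁻¹) − M, which may differ in detail from what the paper uses.

```python
import numpy as np


def masked_scm(X, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    X:    complex STFT, shape (F, T, M) = (freq bins, frames, microphones)
    mask: TF-mask in [0, 1], shape (F, T)
    Returns an array of shape (F, M, M).
    (Names and shapes are assumptions made for this sketch.)
    """
    # Weighted sum over time of the outer products x x^H.
    num = np.einsum("ft,ftm,ftn->fmn", mask, X, X.conj())
    # Normalize by the total mask weight in each frequency bin.
    den = mask.sum(axis=1)[:, None, None] + eps
    return num / den


def multichannel_is_divergence(R, R_hat, eps=1e-6):
    """tr(R R_hat^-1) - log det(R R_hat^-1) - M, evaluated per frequency bin.

    R, R_hat: Hermitian positive semidefinite matrices, shape (F, M, M).
    A small diagonal loading eps (an assumption) keeps the inverse stable.
    """
    M = R.shape[-1]
    load = eps * np.eye(M)
    A = (R + load) @ np.linalg.inv(R_hat + load)
    trace = np.trace(A, axis1=-2, axis2=-1).real
    # slogdet returns (sign, log|det|); log|det| is real for complex input.
    _, logdet = np.linalg.slogdet(A)
    return trace - logdet - M
```

The divergence is zero when the two matrices coincide and non-negative otherwise, so a training loss could sum it over frequency bins, comparing covariance matrices estimated with predicted masks against those obtained from reference signals.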


research
07/22/2022

DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF

This paper describes a practical dual-process speech enhancement system ...
research
11/18/2019

Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement

This work investigates alternation between spectral separation using mas...
research
11/11/2019

Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function

In this paper, we propose a multi-channel speech source separation with ...
research
11/08/2021

Learning Filterbanks for End-to-End Acoustic Beamforming

Recent work on monaural source separation has shown that performance can...
research
05/07/2022

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

Beamforming is a powerful tool designed to enhance speech signals from t...
research
08/11/2021

On The Compensation Between Magnitude and Phase in Speech Separation

Deep neural network (DNN) based end-to-end optimization in the complex t...
research
12/08/2021

NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers

Estimating a time-varying spatial covariance matrix for a beamforming al...
