Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function

11/11/2019
by Masahito Togami, et al.

In this paper, we propose a multi-channel speech source separation method based on a deep neural network (DNN) that is trained under the condition that no clean signal is available. As an alternative to a clean signal, the proposed method uses a speech signal estimated by unsupervised speech source separation with a statistical model. As the statistical model of the microphone input signal, we adopt a time-varying spatial covariance matrix (SCM) model that includes reverberation and background noise submodels, so as to achieve robustness against reverberation and background noise. The DNN infers the intermediate variables needed to construct the time-varying SCM. Speech source separation is performed in a probabilistic manner so as to avoid overfitting to separation errors. Since there are multiple intermediate variables, a loss function that evaluates a single intermediate variable is not applicable. Instead, the proposed method adopts a loss function that directly evaluates the output probabilistic signal based on the Kullback-Leibler divergence (KLD). The gradient of the loss function can be back-propagated into the DNN through all the intermediate variables. Experimental results under reverberant conditions show that the proposed method can train the DNN efficiently even when the number of training utterances is small, i.e., 1K.
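
The abstract does not give the exact form of the KLD-based loss, so the following is only an illustrative sketch. It assumes, as a labeled assumption rather than a statement of the authors' method, that both the reference and the estimated signal are described by circularly-symmetric complex Gaussian distributions, for which the KLD has a closed form. The function name `complex_gaussian_kld` and all variable names are hypothetical.

```python
import numpy as np


def complex_gaussian_kld(mu_p, cov_p, mu_q, cov_q):
    """KL(p || q) for two circularly-symmetric complex Gaussians.

    Assumption (not taken from the abstract): p = N_c(mu_p, cov_p) and
    q = N_c(mu_q, cov_q), with Hermitian positive-definite covariances.
    Closed form:
        tr(cov_q^{-1} cov_p) + d^H cov_q^{-1} d - M
        + ln det(cov_q) - ln det(cov_p),   d = mu_q - mu_p
    """
    m = mu_p.shape[0]
    diff = mu_q - mu_p
    # Solve linear systems instead of forming an explicit inverse.
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p)).real
    # np.vdot conjugates its first argument, giving d^H (cov_q^{-1} d).
    maha_term = np.vdot(diff, np.linalg.solve(cov_q, diff)).real
    # slogdet returns (sign, log|det|); log|det| is real for these inputs.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return trace_term + maha_term - m + logdet_q - logdet_p


# Hypothetical two-microphone example for a single time-frequency bin.
ref_mean = np.array([1.0 + 0.5j, -0.2 + 0.1j])
ref_cov = np.array([[1.0, 0.2 + 0.1j], [0.2 - 0.1j, 0.8]])
est_mean = np.array([0.9 + 0.4j, -0.1 + 0.0j])
est_cov = np.array([[1.2, 0.1 + 0.0j], [0.1 - 0.0j, 1.0]])

loss = complex_gaussian_kld(ref_mean, ref_cov, est_mean, est_cov)
print(f"KLD for this bin: {loss:.4f}")
```

Because the divergence compares whole distributions (mean and covariance) rather than one intermediate quantity, a loss of this kind depends on every variable that shapes the output distribution. In an actual training setup the same expression would be written in an automatic-differentiation framework so that gradients can flow back to the network.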

research
07/11/2019

Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming

In this paper, we propose two mask-based beamforming methods using a dee...
research
11/12/2013

Deep neural networks for single channel source separation

In this paper, a novel approach for single channel source separation (SC...
research
03/26/2022

Remix-cycle-consistent Learning on Adversarially Learned Separator for Accurate and Stable Unsupervised Speech Separation

A new learning algorithm for speech separation networks is designed to e...
research
04/12/2015

Deep Transform: Cocktail Party Source Separation via Complex Convolution in a Deep Neural Network

Convolutional deep neural networks (DNN) are state of the art in many en...
research
04/02/2019

Unsupervised training of a deep clustering model for multichannel blind source separation

We propose a training scheme to train neural network-based source separa...
research
12/08/2021

NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers

Estimating a time-varying spatial covariance matrix for a beamforming al...
research
07/31/2023

Deep Learning Meets Adaptive Filtering: A Stein's Unbiased Risk Estimator Approach

This paper revisits two prominent adaptive filtering algorithms through ...
