Unsupervised training of a deep clustering model for multichannel blind source separation

04/02/2019
by   Lukas Drude, et al.

We propose a scheme for training neural network-based source separation algorithms from scratch when parallel clean data is unavailable. In particular, we demonstrate that an unsupervised spatial clustering algorithm is sufficient to guide the training of a deep clustering system. We argue that previous work on deep clustering requires strong supervision and elaborate on why this is a limitation. We demonstrate that (a) the single-channel deep clustering system trained according to the proposed scheme alone achieves performance similar to that of the multi-channel teacher in terms of word error rate and (b) initializing the spatial clustering approach with the deep clustering result yields a relative word error rate reduction of 26% over the teacher.
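To make the teacher-student idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the teacher's spatial clustering output is available as per-source time-frequency masks, that these are turned into one-hot targets by picking the dominant source per bin, and that the student is trained with the standard deep clustering affinity objective ||VV^T - YY^T||_F^2. The function names, array shapes, and the argmax conversion are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def teacher_masks_to_targets(masks):
    """Convert teacher masks of shape (K, T, F) into one-hot targets (T*F, K).

    Assumption: each time-frequency bin is assigned to the source with the
    largest teacher mask value.
    """
    num_sources = masks.shape[0]
    dominant = masks.argmax(axis=0).reshape(-1)  # (T*F,)
    return np.eye(num_sources)[dominant]


def deep_clustering_loss(embeddings, targets):
    """Affinity loss ||V V^T - Y Y^T||_F^2 for V of shape (N, E), Y of shape (N, K).

    Uses the equivalent low-rank expansion so the N x N affinity matrices
    are never formed explicitly.
    """
    vtv = embeddings.T @ embeddings  # (E, E)
    vty = embeddings.T @ targets     # (E, K)
    yty = targets.T @ targets        # (K, K)
    return np.sum(vtv ** 2) - 2.0 * np.sum(vty ** 2) + np.sum(yty ** 2)
```

In such a setup, the embeddings would come from a single-channel network, while the targets are derived from the multi-channel teacher only, so no clean reference signals enter the loss.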

research
11/05/2018

Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information

We present a monophonic source separation system that is trained by only...
research
12/02/2020

Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks

Spatial clustering techniques can achieve significant multi-channel nois...
research
12/02/2020

Combining Spatial Clustering with LSTM Speech Models for Multichannel Speech Enhancement

Recurrent neural networks using the LSTM architecture can achieve signif...
research
11/11/2019

Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function

In this paper, we propose a multi-channel speech source separation with ...
research
09/01/2023

Remixing-based Unsupervised Source Separation from Scratch

We propose an unsupervised approach for training separation models from ...
research
11/15/2021

Monaural source separation: From anechoic to reverberant environments

Impressive progress in neural network-based single-channel speech source...
research
01/15/2019

Orthonormal Embedding-based Deep Clustering for Single-channel Speech Separation

Deep clustering is a deep neural network-based speech separation algorit...
