Orthonormal Embedding-based Deep Clustering for Single-channel Speech Separation

01/15/2019
by   Soyeon Choe, et al.

Deep clustering is a deep neural network-based speech separation algorithm that first learns high-dimensional embeddings for the components of a mixed signal, and then uses a clustering algorithm to separate the sources in the mixture. In this paper, we extend the baseline deep clustering criterion with an additional regularization term to further improve overall performance. This term constrains the embeddings so that the embedding dimensions are less correlated with one another, leading to better decomposition of the spectral bins. The regularization term helps to mitigate the unavoidable permutation problem in the conventional deep clustering method, enabling better clustering through the formation of optimal embeddings. We evaluate the results by varying the embedding dimension, the signal-to-interference ratio (SIR), and gender dependency. A performance comparison using a standard source separation metric, the signal-to-distortion ratio (SDR), confirms that the proposed method outperforms the conventional deep clustering method.
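
The abstract does not state the regularization term explicitly, so the following is only a minimal sketch of how such an objective might look. It assumes the widely used deep clustering criterion ||V V^T - Y Y^T||_F^2, where V holds one embedding per time-frequency bin and Y holds one-hot source assignments, and it assumes a penalty of the form ||V^T V - I||_F^2 as one plausible reading of "orthonormal embedding". The function names and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def deep_clustering_loss(V, Y):
    """Baseline criterion ||V V^T - Y Y^T||_F^2.

    V: (N, D) embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot source assignments for the same bins.
    The expanded form below avoids building any N x N matrix.
    """
    VtV = V.T @ V  # (D, D)
    VtY = V.T @ Y  # (D, C)
    YtY = Y.T @ Y  # (C, C)
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtY ** 2) + np.sum(YtY ** 2)


def orthonormality_penalty(V):
    """Assumed regularizer ||V^T V - I||_F^2 (not given in the abstract).

    It is zero when the D embedding dimensions (columns of V) are
    mutually orthogonal and unit-norm, i.e. fully decorrelated.
    """
    D = V.shape[1]
    return np.sum((V.T @ V - np.eye(D)) ** 2)


def regularized_loss(V, Y, weight=0.1):
    """Baseline loss plus the assumed penalty; `weight` is hypothetical."""
    return deep_clustering_loss(V, Y) + weight * orthonormality_penalty(V)
```

In a sketch like this, the penalty pushes the off-diagonal entries of V^T V toward zero, which is one way to express "less correlation" between embedding dimensions. At inference time the rows of V would then be grouped with a clustering algorithm such as k-means to assign each bin to a source.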

research
11/18/2016

Deep Clustering and Conventional Networks for Music Separation: Stronger Together

Deep clustering is the first method to handle general audio separation s...
research
08/07/2023

Improving Deep Attractor Network by BGRU and GMM for Speech Separation

Deep Attractor Network (DANet) is the state-of-the-art technique in spee...
research
06/24/2019

Single-Channel Speech Separation with Auxiliary Speaker Embeddings

We present a novel source separation model to decompose a single-channel ...
research
07/07/2016

Single-Channel Multi-Speaker Separation using Deep Clustering

Deep clustering is a recently introduced deep learning architecture that...
research
02/05/2020

Spatial and spectral deep attention fusion for multi-channel speech separation using deep embedding features

Multi-channel deep clustering (MDC) has acquired a good performance for ...
research
04/02/2019

Unsupervised training of a deep clustering model for multichannel blind source separation

We propose a training scheme to train neural network-based source separa...
research
10/13/2021

One to Multiple Mapping Dual Learning: Learning Multiple Sources from One Mixed Signal

Single channel blind source separation (SCBSS) refers to separate multip...
