
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training

by Shaked Dovrat, et al.

Single channel speech separation has made great progress in the last few years. However, training neural speech separation for a large number of speakers (e.g., more than 10) is out of reach for current methods, which rely on the Permutation Invariant Loss (PIT). In this work, we present a permutation-invariant training scheme that employs the Hungarian algorithm to train with O(C^3) time complexity, where C is the number of speakers, compared with the O(C!) complexity of PIT-based methods. Furthermore, we present a modified architecture that can handle the increased number of speakers. Our approach separates up to 20 speakers and improves on previous results for large C by a wide margin.
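The complexity gap described above can be illustrated with a small sketch. Classic PIT enumerates all C! speaker permutations to find the lowest-cost matching between estimated and reference sources, while the Hungarian algorithm solves the same assignment problem in O(C^3). The sketch below (not the authors' code) uses SciPy's `linear_sum_assignment`, an existing implementation of the Hungarian method; the pairwise loss matrix is a hypothetical toy example, where in practice entry (i, j) would be a per-pair separation loss such as negative SI-SNR between estimate i and reference j.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def hungarian_permutation_loss(pairwise_loss):
    """Find the speaker permutation minimizing the total matching loss.

    pairwise_loss[i, j] is the loss of assigning estimated source i to
    reference source j. linear_sum_assignment solves this assignment
    problem in O(C^3) time, instead of enumerating all C! permutations
    as in classic PIT.
    """
    rows, cols = linear_sum_assignment(pairwise_loss)
    return pairwise_loss[rows, cols].mean(), cols


# Toy example with C = 3 speakers: the diagonal is the cheapest matching,
# so the optimal permutation maps estimate i to reference i.
L = np.array([[0.1, 0.9, 0.8],
              [0.7, 0.2, 0.9],
              [0.8, 0.9, 0.3]])
loss, perm = hungarian_permutation_loss(L)
# perm -> [0, 1, 2], loss -> (0.1 + 0.2 + 0.3) / 3 = 0.2
```

For C = 20, brute-force PIT would evaluate 20! ≈ 2.4 × 10^18 permutations per training example, while the assignment-based matching stays on the order of 20^3 = 8000 operations, which is what makes training at that speaker count feasible.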


Recursive speech separation for unknown number of speakers

In this paper we propose a method of single-channel speaker-independent ...

SepIt: Approaching a Single Channel Speech Separation Bound

We present an upper bound for the Single Channel Speech Separation task,...

Multi-accent Speech Separation with One Shot Learning

Speech separation is a problem in the field of speech processing that ha...

Location-based training for multi-channel talker-independent speaker separation

Permutation-invariant training (PIT) is a dominant approach for addressi...

Single channel voice separation for unknown number of speakers under reverberant and noisy settings

We present a unified network for voice separation of an unknown number o...

Surrogate Source Model Learning for Determined Source Separation

We propose to learn surrogate functions of universal speech priors for d...

Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

Many recent source separation systems are designed to separate a fixed n...