Speeding Up Permutation Invariant Training for Source Separation

07/30/2021
by Thilo von Neumann, et al.

Permutation invariant training (PIT) is a widely used training criterion for neural network-based source separation. It is applied both to utterance-level separation, as utterance-level PIT (uPIT), and to the separation of long recordings, as the recently proposed Graph-PIT. When implemented naively, both suffer from complexity that is exponential in the number of utterances to separate, which renders them unusable for large numbers of speakers or long, realistic recordings. We present a decomposition of the PIT criterion into the computation of a matrix and a strictly monotonically increasing function, so that the permutation or assignment problem can be solved efficiently with several search algorithms. The Hungarian algorithm can be used for uPIT, and we introduce various algorithms for the Graph-PIT assignment problem that reduce the complexity to polynomial in the number of utterances.
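To make the uPIT case concrete, the following is a minimal sketch, not the authors' implementation. It assumes a mean squared error loss, for which the strictly increasing function reduces to a simple average over the selected matrix entries. The helper names (`pairwise_mse`, `upit_hungarian`, `upit_brute_force`) and the toy data are hypothetical. The pairwise loss matrix is computed once, and the assignment is then solved with SciPy's `linear_sum_assignment` instead of enumerating all permutations.

```python
import itertools

import numpy as np
from scipy.optimize import linear_sum_assignment


def pairwise_mse(estimates, targets):
    """Return M with M[i, j] = MSE between estimate i and target j.

    Both inputs have shape (K, T): K sources, T samples.
    """
    diff = estimates[:, None, :] - targets[None, :, :]
    return np.mean(diff ** 2, axis=-1)


def upit_hungarian(estimates, targets):
    """uPIT loss via a linear sum assignment solver on the pairwise matrix."""
    cost = pairwise_mse(estimates, targets)
    rows, cols = linear_sum_assignment(cost)  # minimizes the summed cost
    return cost[rows, cols].mean(), cols


def upit_brute_force(estimates, targets):
    """Reference: try all K! permutations on the same matrix."""
    cost = pairwise_mse(estimates, targets)
    k = cost.shape[0]
    idx = np.arange(k)
    best = min(
        itertools.permutations(range(k)),
        key=lambda p: cost[idx, list(p)].sum(),
    )
    return cost[idx, list(best)].mean(), np.array(best)


# Toy data (assumption): estimates are a shuffled, slightly noisy copy
# of the targets, so estimate i belongs to target true_perm[i].
rng = np.random.default_rng(0)
targets = rng.standard_normal((4, 100))
true_perm = [2, 0, 3, 1]
estimates = targets[true_perm] + 0.01 * rng.standard_normal((4, 100))

loss_h, assignment = upit_hungarian(estimates, targets)
loss_b, assignment_b = upit_brute_force(estimates, targets)
```

Both functions return the same loss and assignment on this toy input. The difference is cost: the brute-force search visits K! permutations, whereas the assignment solver runs in time polynomial in K. Graph-PIT needs a different solver because its assignment is constrained by the overlap structure of the recording, so this sketch does not cover it.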


