Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation

10/29/2020
by   Sung-Feng Huang, et al.

Speech separation is well developed, but several problems remain unsolved. The main problem we focus on in this paper is the frequent label permutation switching of permutation invariant training (PIT). For N-speaker separation there are N! possible label permutations, and stably selecting the correct one is a long-standing problem. In this paper, we utilize self-supervised pre-training to stabilize the label permutations. Among several types of self-supervised tasks, speech-enhancement-based pre-training proves especially effective in our experiments. With off-the-shelf pre-trained models, training time can be shortened to one-third to two-thirds of the original. Furthermore, even when pre-training time is taken into account, the entire training process can still be shorter, with no performance drop, when a larger batch size is used.
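The PIT objective mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact implementation: the function name `pit_mse` and the choice of mean-squared error as the pairwise loss are illustrative assumptions (separation systems more commonly use SI-SNR or similar). The key idea is that, for each utterance, the loss is evaluated under all N! speaker-to-source assignments and the best one is selected, which is exactly where the label permutation switching across training steps can occur.

```python
import itertools
import numpy as np

def pit_mse(estimates, targets):
    """Permutation-invariant MSE for N-speaker separation (illustrative sketch).

    estimates, targets: arrays of shape (num_speakers, time).
    Tries all N! speaker-to-source assignments and returns the loss
    of the best assignment together with the chosen permutation.
    """
    n = targets.shape[0]
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        # average pairwise MSE under this speaker-to-source assignment
        loss = np.mean([np.mean((estimates[i] - targets[p]) ** 2)
                        for i, p in enumerate(perm)])
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Because the selected permutation can differ from one mini-batch to the next, gradients may point toward inconsistent output-to-speaker assignments; the paper's observation is that initializing from a self-supervised (e.g. enhancement-based) pre-trained model makes this selection stabilize much earlier in training.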


