Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation

06/15/2021
by Jisi Zhang, et al.

In this paper, we introduce a novel semi-supervised learning framework for end-to-end speech separation. The proposed method first uses mixtures of unseparated sources and the mixture invariant training (MixIT) criterion to train a teacher model. The teacher model then estimates separated sources that are used to train a student model with standard permutation invariant training (PIT). The student model can be fine-tuned with supervised data, i.e., paired artificial mixtures and clean speech sources, and further improved via model distillation. Experiments with single- and multi-channel mixtures show that the teacher-student training resolves the over-separation problem observed in the original MixIT method. Further, the semi-supervised performance is comparable to that of a fully supervised separation system trained using ten times the amount of supervised data.
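
The two training criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain mean squared error in place of whatever signal-level loss the paper uses, it contains no neural network, and the function names `mse`, `mixit_loss`, and `pit_loss` are hypothetical. MixIT searches over assignments of estimated sources to the input mixtures, while PIT searches over orderings of the estimates against a set of reference (here, teacher-estimated) sources.

```python
import itertools


def mse(a, b):
    """Mean squared error between two equal-length signals (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mixit_loss(mixtures, estimates):
    """MixIT-style criterion (sketch, MSE assumed as the signal loss).

    Each estimated source is assigned to exactly one input mixture; the
    assignment whose remixed signals best match the mixtures is kept.
    Returns (loss, assignment), where assignment[i] is the mixture index
    chosen for estimates[i].
    """
    n_mix, length = len(mixtures), len(mixtures[0])
    best = (float("inf"), None)
    for assign in itertools.product(range(n_mix), repeat=len(estimates)):
        remix = [[0.0] * length for _ in range(n_mix)]
        for src, mix in enumerate(assign):
            remix[mix] = [r + e for r, e in zip(remix[mix], estimates[src])]
        loss = sum(mse(m, r) for m, r in zip(mixtures, remix)) / n_mix
        if loss < best[0]:
            best = (loss, assign)
    return best


def pit_loss(references, estimates):
    """PIT criterion (sketch): best permutation of estimates against references.

    Returns (loss, perm), where references[i] is paired with estimates[perm[i]].
    """
    best = (float("inf"), None)
    for perm in itertools.permutations(range(len(estimates))):
        loss = sum(mse(references[i], estimates[p])
                   for i, p in enumerate(perm)) / len(perm)
        if loss < best[0]:
            best = (loss, perm)
    return best


# Toy signals standing in for three speech sources (illustrative values only).
s0 = [1.0, 0.0, -1.0, 0.0]
s1 = [0.0, 2.0, 0.0, -2.0]
s2 = [0.5, 0.5, -0.5, -0.5]

# Teacher stage: two unlabeled mixtures; the model sees only their sum.
mix_a = [a + b for a, b in zip(s0, s1)]
mix_b = s2
teacher_estimates = [s2, s0, s1]  # pretend output of a perfect teacher
mixit_value, assignment = mixit_loss([mix_a, mix_b], teacher_estimates)

# Student stage: teacher estimates serve as pseudo-targets under PIT.
student_estimates = [s0, s1, s2]  # pretend output of a perfect student
pit_value, permutation = pit_loss(teacher_estimates, student_estimates)
```

Both searches are exhaustive, which is affordable only for a small number of sources; the sketch is meant to show the structure of the two objectives rather than an efficient training loop.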


