RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

02/17/2022
by   Efthymios Tzinis, et al.

We present RemixIT, a simple yet effective self-supervised method for training speech enhancement models without requiring any isolated in-domain speech or noise waveforms. Our approach overcomes a limitation of previous methods, which depend on clean in-domain target signals and are therefore sensitive to any domain mismatch between train and test samples. RemixIT is based on a continuous self-training scheme in which a teacher model, pre-trained on out-of-domain data, infers estimated pseudo-target signals for in-domain mixtures. Then, by permuting the estimated clean and noise signals and remixing them together, we generate a new set of bootstrapped mixtures and corresponding pseudo-targets which are used to train the student network. Conversely, the teacher periodically refines its estimates using the updated parameters of the latest student models. Experimental results on multiple speech enhancement datasets and tasks not only show the superiority of our method over prior approaches but also showcase that RemixIT can be combined with any separation model and applied to any semi-supervised or unsupervised domain adaptation task. Our analysis, paired with empirical evidence, sheds light on the inner workings of our self-training scheme, wherein the student model keeps improving while observing severely degraded pseudo-targets.
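The bootstrapped remixing and teacher-refinement steps described in the abstract can be sketched as follows. This is a minimal numpy illustration under stated assumptions, not the authors' implementation: `bootstrapped_remix` and `ema_update` are hypothetical names, arrays stand in for batches of waveforms, and the EMA rule is only one possible teacher update protocol.

```python
# Hypothetical sketch of RemixIT's bootstrapped remixing step.
# numpy arrays stand in for batches of waveform tensors produced by a
# neural separator; this is an illustration, not the paper's code.
import numpy as np

def bootstrapped_remix(est_speech, est_noise, rng=None):
    """Permute the teacher's noise estimates across the batch and remix
    them with its speech estimates, yielding new training mixtures whose
    pseudo-targets are the (now known) speech estimates."""
    rng = np.random.default_rng(rng)
    perm = rng.permutation(len(est_noise))   # random re-pairing of speech/noise
    new_mixtures = est_speech + est_noise[perm]
    return new_mixtures, est_speech          # (student inputs, pseudo-targets)

def ema_update(teacher_w, student_w, beta=0.99):
    """One possible teacher-refinement rule (assumed here for illustration):
    an exponential moving average of the latest student weights."""
    return beta * teacher_w + (1.0 - beta) * student_w
```

The student is then trained to map each remixed mixture back to its pseudo-target, and the teacher's weights are periodically moved toward the student's, closing the self-training loop.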

Related research

- Continual self-training with bootstrapped remixing for speech enhancement (10/19/2021)
- A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference (10/27/2022)
- Self-Remixing: Unsupervised Speech Separation via Separation and Remixing (11/18/2022)
- Self-Supervised Learning based Monaural Speech Enhancement with Multi-Task Pre-Training (12/21/2021)
- Remixing-based Unsupervised Source Separation from Scratch (09/01/2023)
- Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification (04/05/2021)
- NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling (06/18/2022)
