MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

10/12/2021
by   Szu-Wei Fu, et al.
49

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training. Consequently, several noisy speeches recorded in daily life cannot be used to train the model. Although certain unsupervised learning frameworks have also been proposed to solve the pair constraint, they still require clean speech or noise for training. Therefore, in this paper, we propose MetricGAN-U, which stands for MetricGAN-unsupervised, to further release the constraint from conventional unsupervised learning. In MetricGAN-U, only noisy speech is required to train the model by optimizing non-intrusive speech quality metrics. The experimental results verified that MetricGAN-U outperforms baselines in both objective and subjective metrics.

READ FULL TEXT
research
10/27/2022

A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference

The lack of clean speech is a practical challenge to the development of ...
research
11/02/2022

Analysis of Noisy-target Training for DNN-based speech enhancement

Deep neural network (DNN)-based speech enhancement usually uses a clean ...
research
12/09/2021

A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks

Deep learning-based speech enhancement has shown unprecedented performan...
research
03/04/2022

MANNER: Multi-view Attention Network for Noise Erasure

In the field of speech enhancement, time domain methods have difficultie...
research
04/28/2020

Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

Attention-based sequence-to-sequence (seq2seq) speech synthesis has achi...
research
01/06/2020

Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders

Deep learning-based models have greatly advanced the performance of spee...
research
11/11/2020

Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning

Recurrent neural networks (RNNs) have shown significant improvements in ...

Please sign up or login with your details

Forgot password? Click here to reset