MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

by Szu-Wei Fu, et al.

Most deep learning-based speech enhancement models are trained in a supervised manner, which means that pairs of noisy and clean speech are required during training. Consequently, noisy speech recorded in daily life, for which no clean counterpart exists, cannot be used to train the model. Although certain unsupervised learning frameworks have been proposed to remove this pairing constraint, they still require clean speech or noise for training. In this paper, we therefore propose MetricGAN-U (MetricGAN-unsupervised), which further relaxes the constraints of conventional unsupervised learning. MetricGAN-U requires only noisy speech for training: the model is trained by optimizing non-intrusive speech quality metrics, that is, metrics computed without a clean reference signal. Experimental results show that MetricGAN-U outperforms the baselines in both objective and subjective metrics.


Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses

Speech enhancement has recently achieved great success with various deep...

Self-supervised Learning for Speech Enhancement

Supervised learning for single-channel speech enhancement requires caref...

A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks

Deep learning-based speech enhancement has shown unprecedented performan...

MANNER: Multi-view Attention Network for Noise Erasure

In the field of speech enhancement, time domain methods have difficultie...

Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders

Deep learning-based models have greatly advanced the performance of spee...

Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

Attention-based sequence-to-sequence (seq2seq) speech synthesis has achi...

Complex spectrogram enhancement by convolutional neural network with multi-metrics learning

This paper aims to address two issues existing in the current speech enh...