On Loss Functions and Recurrency Training for GAN-based Speech Enhancement Systems

07/29/2020
by Zhuohuang Zhang, et al.

Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement; however, these approaches have not been compared to state-of-the-art (SOTA) non-GAN-based approaches. Additionally, many loss functions have been proposed for GAN-based approaches, but they have not been adequately compared. In this study, we propose novel convolutional recurrent GAN (CRGAN) architectures for speech enhancement. Multiple loss functions are adopted to enable direct comparisons to other GAN-based systems. The benefits of including recurrent layers are also explored. Our results show that the proposed CRGAN model outperforms the SOTA GAN-based models when using the same loss functions, and that it also outperforms non-GAN-based systems, indicating the benefits of using a GAN for speech enhancement. Overall, the CRGAN model that combines an objective metric loss function with the mean squared error (MSE) achieves the best performance among the compared approaches across many evaluation metrics.

research
06/16/2021

A Flow-Based Neural Network for Time Domain Speech Enhancement

Speech enhancement involves the distinction of a target speech signal fr...
research
10/26/2022

SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks

In recent years, Generative Adversarial Networks (GANs) have produced si...
research
09/03/2019

On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement

Many deep learning-based speech enhancement algorithms are designed to m...
research
05/13/2019

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement

Adversarial loss in a conditional generative adversarial network (GAN) i...
research
10/21/2022

Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation

Deep generative models for Speech Enhancement (SE) received increasing a...
research
01/15/2020

Improving GANs for Speech Enhancement

Generative adversarial networks (GAN) have recently been shown to be eff...
research
12/08/2019

A Supervised Speech enhancement Approach with Residual Noise Control for Voice Communication

For voice communication, it is important to extract the speech from its ...
