Time-domain Speech Enhancement with Generative Adversarial Learning

03/30/2021
by Feiyang Xiao, et al.

Speech enhancement aims to recover speech with high intelligibility and quality from noisy speech. Recent work has demonstrated the excellent performance of time-domain deep learning methods such as Conv-TasNet. However, these methods can be degraded by the arbitrary waveform scale induced by the scale-invariant signal-to-noise ratio (SI-SNR) loss. This paper proposes a new framework, the Time-domain Speech Enhancement Generative Adversarial Network (TSEGAN), which extends the generative adversarial network (GAN) to the time domain and uses metric evaluation to mitigate the scaling problem and stabilize model training, thereby improving performance. In addition, we provide a new method, based on objective function mapping, for theoretically analyzing the performance of Metric GAN, and use it to explain why Metric GAN outperforms the Wasserstein GAN. Experiments demonstrate the effectiveness of the proposed method and illustrate the advantage of Metric GAN.
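
The scaling problem mentioned in the abstract can be illustrated with the commonly used definition of SI-SNR, under which rescaling the estimate leaves the score unchanged, so a model trained only on this loss receives no signal about output amplitude. The following is a minimal NumPy sketch under that assumption; it is not taken from the paper, and the function name `si_snr`, the `eps` constant, and the synthetic signals are hypothetical choices made for illustration.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D waveforms (common definition)."""
    # Remove the mean so the measure ignores any DC offset.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to isolate the target component.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    # Everything not explained by the target counts as residual noise.
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)


# Synthetic stand-ins for a clean signal and an enhanced estimate (assumed data).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noisy_estimate = clean + 0.1 * rng.standard_normal(16000)

base = si_snr(noisy_estimate, clean)
rescaled = si_snr(5.0 * noisy_estimate, clean)  # same estimate, 5x the amplitude

print(f"SI-SNR of estimate:          {base:.4f} dB")
print(f"SI-SNR of rescaled estimate: {rescaled:.4f} dB")
```

The two printed values should agree up to floating-point error, which is the sense in which the loss leaves the waveform scale arbitrary.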

research
02/04/2021

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech enhancement is an essential task of improving speech quality in n...
research
07/27/2020

On the Use of Audio Fingerprinting Features for Speech Enhancement with Generative Adversarial Network

The advent of learning-based methods in speech enhancement has revived t...
research
11/10/2019

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model

Nowadays vast amounts of speech data are recorded from low-quality recor...
research
08/21/2019

Coarse-to-fine Optimization for Speech Enhancement

In this paper, we propose the coarse-to-fine optimization for the task o...
research
10/29/2020

UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

Speech enhancement at extremely low signal-to-noise ratio (SNR) conditio...
research
08/26/2021

A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement

Deep learning technology has been widely applied to speech enhancement. ...
research
04/06/2019

Towards Generalized Speech Enhancement with Generative Adversarial Networks

The speech enhancement task usually consists of removing additive noise ...
