UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

10/29/2020
by   Xiang Hao, et al.

Speech enhancement under extremely low signal-to-noise ratio (SNR) conditions is a very challenging problem that has rarely been investigated in previous work. This paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net and generative adversarial learning to address this problem. The approach consists of a generator network and a discriminator network, both of which operate directly in the time domain. The generator adopts a U-Net-like structure and employs dilated convolutions in its bottleneck. We evaluate the performance of UNetGAN under low SNR conditions (down to -20 dB) on a public benchmark. The results demonstrate that it significantly improves speech quality and substantially outperforms representative deep learning models, including SEGAN, cGAN for SE, a bidirectional LSTM using a phase-sensitive spectrum approximation cost function (PSA-BLSTM), and Wave-U-Net, in terms of Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ).
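To make the -20 dB condition concrete: SNR in decibels is 10 log10(P_speech / P_noise), so at -20 dB the noise carries 100 times the power of the speech. The sketch below mixes a clean signal with noise at a chosen SNR using only the Python standard library. It is a generic illustration, not the paper's data preparation pipeline. The function names (`mix_at_snr`, `measure_snr_db`) and the toy sine and Gaussian signals are assumptions introduced here.

```python
import math
import random


def power(signal):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR.

    SNR_dB = 10 * log10(P_clean / P_noise), so the noise is rescaled to
    power P_clean / 10 ** (target_snr_db / 10) before being added.
    """
    target_noise_power = power(clean) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled_noise = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(clean, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * math.log10(power(clean) / power(noise))


# Toy stand-ins for real recordings (hypothetical data, 1 s at 16 kHz).
rng = random.Random(0)
sample_rate = 16000
clean = [0.1 * math.sin(2 * math.pi * 220 * t / sample_rate)
         for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=-20.0)
measured = measure_snr_db(clean, scaled_noise)
power_ratio = power(scaled_noise) / power(clean)
print(f"measured SNR: {measured:.2f} dB")
print(f"noise-to-speech power ratio: {power_ratio:.1f}")
```

In a mixture like this the speech is buried well below the noise floor, which is why the abstract treats this range as a separate and harder regime than the moderate SNRs usually reported.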

research
05/29/2020

SNR-based teachers-student technique for speech enhancement

It is very challenging for speech enhancement methods to achieve robust...
research
09/03/2023

Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement

Speech emotion recognition (SER) often experiences reduced performance d...
research
03/30/2021

Time-domain Speech Enhancement with Generative Adversarial Learning

Speech enhancement aims to obtain speech signals with high intelligibili...
research
03/04/2022

PercepNet+: A Phase and SNR Aware PercepNet for Real-Time Speech Enhancement

PercepNet, a recent extension of the RNNoise, an efficient, high-quality...
research
09/16/2021

DDS: A new device-degraded speech dataset for speech enhancement

A large and growing amount of speech content in real-life scenarios is b...
research
09/06/2017

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification

Improving speech system performance in noisy environments remains a chal...
research
10/11/2017

PROSE: Perceptual Risk Optimization for Speech Enhancement

The goal in speech enhancement is to obtain an estimate of clean speech ...
