A Flow-Based Neural Network for Time Domain Speech Enhancement

06/16/2021
by Martin Strauss, et al.

Speech enhancement involves separating a target speech signal from an intrusive background. Although generative approaches using Variational Autoencoders or Generative Adversarial Networks (GANs) have increasingly been used in recent years, normalizing flow (NF) based systems are still scarce, despite their success in related fields. Thus, in this paper we propose an NF framework to directly model the enhancement process by density estimation of clean speech utterances conditioned on their noisy counterparts. The WaveGlow model from speech synthesis is adapted to enable direct enhancement of noisy utterances in the time domain. In addition, we demonstrate that nonlinear input companding benefits model performance by equalizing the distribution of input samples. Experimental evaluation on a publicly available dataset shows results comparable to current state-of-the-art GAN-based approaches, while surpassing the chosen baselines on objective evaluation metrics.
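
To illustrate the companding idea mentioned in the abstract, here is a minimal sketch of nonlinear input companding. It assumes mu-law companding with mu = 255, which the abstract does not specify; the function names and the constant are hypothetical and chosen for illustration only. Time-domain speech samples cluster near zero, and a logarithmic compression of this kind spreads small amplitudes over a wider range, which is one way to equalize the distribution of input samples.

```python
import math

MU = 255.0  # assumed companding constant; not stated in the abstract


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Compress a sample in [-1, 1] logarithmically; output stays in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Invert mu_law_compress, recovering the original sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Small amplitudes are stretched, large amplitudes are squeezed.
samples = [-0.5, -0.01, 0.0, 0.01, 0.5]
compressed = [mu_law_compress(s) for s in samples]
restored = [mu_law_expand(c) for c in compressed]

for s, c in zip(samples, compressed):
    print(f"{s:+.3f} -> {c:+.3f}")
```

Because the mapping is invertible, a model could in principle operate on the companded signal and its output could be expanded back to the original amplitude scale afterwards.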

research
07/29/2020

On Loss Functions and Recurrency Training for GAN-based Speech Enhancement Systems

Recent work has shown that it is feasible to use generative adversarial ...
research
10/21/2022

Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation

Deep generative models for Speech Enhancement (SE) received increasing a...
research
02/20/2020

iSEGAN: Improved Speech Enhancement Generative Adversarial Networks

Popular neural network-based speech enhancement systems operate on the m...
research
04/06/2019

Towards Generalized Speech Enhancement with Generative Adversarial Networks

The speech enhancement task usually consists of removing additive noise ...
research
09/06/2017

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification

Improving speech system performance in noisy environments remains a chal...
research
04/02/2020

iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning

The intelligibility of natural speech is seriously degraded when exposed...
research
09/10/2019

Generative Speech Enhancement Based on Cloned Networks

We propose to implement speech enhancement by the regeneration of clean ...
