Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation

10/21/2022
by   Martin Strauss, et al.

Deep generative models for Speech Enhancement (SE) have received increasing attention in recent years. The most prominent examples are Generative Adversarial Networks (GANs), while normalizing flows (NF) have received less attention despite their potential. Building on previous work, architectural modifications are proposed, along with an investigation of different conditional input representations. Despite being a common choice in related work, Mel-spectrograms prove to be inadequate for the given scenario. As an alternative, a novel All-Pole Gammatone filterbank (APG) with high temporal resolution is proposed. Although computational evaluation metrics suggest that state-of-the-art GAN-based methods perform best, a perceptual evaluation via a listening test indicates that the presented NF approach (based on time domain and APG) performs best, especially at lower SNRs. On average, APG outputs are rated as having good quality, which is unmatched by the other methods, including GAN.
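To make the idea of an all-pole gammatone filter more concrete, the following is a minimal sketch of one channel, built as a cascade of identical two-pole resonators operating directly on time-domain samples. It is not the filterbank design from the paper: the function names, the `bandwidth` decay parameter, the filter order of 4, and the example frequencies are all assumptions chosen for illustration, and the filter is left unnormalized.

```python
import math


def apg_section_coeffs(fc, bandwidth, fs):
    """Denominator coefficients (a1, a2) of one two-pole resonator.

    The pole pair sits at r * exp(+/- j * theta). `bandwidth` (Hz) is a
    hypothetical parameter that controls how fast the response decays.
    """
    r = math.exp(-2.0 * math.pi * bandwidth / fs)
    theta = 2.0 * math.pi * fc / fs
    return -2.0 * r * math.cos(theta), r * r


def apg_filter(signal, fc, bandwidth, fs, order=4):
    """Filter `signal` with `order` cascaded all-pole sections (no zeros)."""
    a1, a2 = apg_section_coeffs(fc, bandwidth, fs)
    out = list(signal)
    for _ in range(order):
        y1 = y2 = 0.0  # previous two outputs of this section
        stage = []
        for x in out:
            y = x - a1 * y1 - a2 * y2
            stage.append(y)
            y2, y1 = y1, y
        out = stage
    return out


def tone(freq, fs, n):
    """Unit-amplitude sine of `n` samples, used only for the demo."""
    return [math.sin(2.0 * math.pi * freq * k / fs) for k in range(n)]


def energy(samples):
    return sum(s * s for s in samples)


if __name__ == "__main__":
    fs = 16000
    on_center = apg_filter(tone(1000, fs, 4000), fc=1000, bandwidth=100, fs=fs)
    off_center = apg_filter(tone(4000, fs, 4000), fc=1000, bandwidth=100, fs=fs)
    print("energy ratio (on/off center):", energy(on_center) / energy(off_center))
```

Because each channel produces one output sample per input sample, a bank of such filters at different centre frequencies keeps the time resolution of the waveform, which is the property the abstract contrasts with Mel-spectrograms.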


research
06/16/2021

A Flow-Based Neural Network for Time Domain Speech Enhancement

Speech enhancement involves the distinction of a target speech signal fr...
research
10/26/2022

SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks

In recent years, Generative Adversarial Networks (GANs) have produced si...
research
07/29/2020

On Loss Functions and Recurrency Training for GAN-based Speech Enhancement Systems

Recent work has shown that it is feasible to use generative adversarial ...
research
06/13/2020

Dynamic Attention Based Generative Adversarial Network with Phase Post-Processing for Speech Enhancement

The generative adversarial networks (GANs) have facilitated the developm...
research
02/11/2023

Attention does not guarantee best performance in speech enhancement

Attention mechanism has been widely utilized in speech enhancement (SE) ...
research
01/15/2020

Improving GANs for Speech Enhancement

Generative adversarial networks (GAN) have recently been shown to be eff...
research
03/24/2022

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

Generative adversarial networks have recently demonstrated outstanding p...
