On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings

06/05/2023
by Danilo de Oliveira, et al.

Since its inception, the field of deep speech enhancement has been dominated by predictive (discriminative) approaches, such as spectral mapping or masking. Recently, however, novel generative approaches have been applied to speech enhancement, attaining good denoising performance with high subjective quality scores. At the same time, advances in deep learning have also enabled the creation of neural network-based metrics, which have desirable traits such as being able to work without a reference (non-intrusively). Since generatively enhanced speech tends to exhibit radically different residual distortions, instrumental speech metrics may behave differently on it than on predictively enhanced speech. In this paper, we evaluate the performance of the same speech enhancement backbone trained under predictive and generative paradigms on a variety of metrics and show that intrusive and non-intrusive measures correlate differently for each paradigm. This analysis motivates the search for metrics that can together paint a complete and unbiased picture of speech enhancement performance, irrespective of the model's training process.
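To make the idea of measures that "correlate differently for each paradigm" concrete, the following is a minimal sketch of how such a comparison could be set up. It is not the authors' evaluation code: the function names, the dictionary layout, and the use of the Pearson coefficient are all assumptions made for illustration, and the numbers are placeholder values, not results from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def correlation_by_paradigm(scores):
    """Correlate intrusive and non-intrusive scores separately per paradigm.

    `scores` maps a paradigm name to per-utterance score lists
    (hypothetical layout chosen for this sketch).
    """
    return {
        paradigm: pearson(metrics["intrusive"], metrics["non_intrusive"])
        for paradigm, metrics in scores.items()
    }


# Placeholder per-utterance scores, invented purely to exercise the code.
example_scores = {
    "predictive": {
        "intrusive": [2.1, 2.8, 3.0, 3.4],
        "non_intrusive": [2.5, 3.1, 3.2, 3.9],
    },
    "generative": {
        "intrusive": [2.0, 2.9, 2.4, 3.1],
        "non_intrusive": [3.6, 3.2, 3.8, 3.5],
    },
}

if __name__ == "__main__":
    for paradigm, r in correlation_by_paradigm(example_scores).items():
        print(f"{paradigm}: r = {r:.2f}")
```

Computing the coefficient per paradigm, rather than over the pooled outputs of both models, is what lets a difference in metric agreement between the two training approaches show up at all.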

research
01/26/2022

A two-step backward compatible fullband speech enhancement system

Speech enhancement methods based on deep learning have surpassed traditi...
research
12/09/2021

A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks

Deep learning-based speech enhancement has shown unprecedented performan...
research
10/02/2021

Processing Phoneme Specific Segments for Cleft Lip and Palate Speech Enhancement

The cleft lip and palate (CLP) speech intelligibility is distorted due t...
research
05/25/2021

RNNoise-Ex: Hybrid Speech Enhancement System based on RNN and Spectral Features

Recent interest in exploiting Deep Learning techniques for Noise Suppres...
research
03/31/2022

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

It is essential to perform speech intelligibility (SI) experiments with ...
research
01/03/2019

Deep Speech Enhancement for Reverberated and Noisy Signals using Wide Residual Networks

This paper proposes a deep speech enhancement method which exploits the ...
research
04/09/2019

Speech Enhancement with Wide Residual Networks in Reverberant Environments

This paper proposes a speech enhancement method which exploits the high ...
