Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration

11/04/2022
by Jean-Marie Lemercier, et al.

Diffusion-based generative models have had a high impact on the computer vision and speech processing communities in recent years. Besides data generation tasks, they have also been employed for data restoration tasks such as speech enhancement and dereverberation. While discriminative models have traditionally been argued to be more powerful, e.g. for speech enhancement, generative diffusion approaches have recently been shown to narrow this performance gap considerably. In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. For this, we extend our prior contributions on diffusion-based speech enhancement in the complex time-frequency domain to the task of bandwidth extension. We then compare this generative model to a discriminatively trained neural network with the same network architecture on three restoration tasks, namely speech denoising, dereverberation and bandwidth extension. We observe that the generative approach performs better overall than its discriminative counterpart on all tasks, with the strongest benefit for non-additive distortion models, as in dereverberation and bandwidth extension. Code and audio examples can be found online at https://uhh.de/inf-sp-sgmsemultitask
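
To make the distinction between additive and non-additive distortion models concrete, the following is a minimal NumPy sketch of one corruption per restoration task named in the abstract. It is an illustration under stated assumptions, not the authors' data pipeline: the function names (`add_noise`, `reverberate`, `band_limit`), the synthetic test signal, the random decaying impulse response, and the FFT-bin masking used as a low-pass filter are all hypothetical choices made for this example.

```python
import numpy as np


def add_noise(clean, noise, snr_db):
    """Additive model: y = x + g * n, with g chosen to reach the target SNR."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise


def reverberate(clean, rir):
    """Convolutive (non-additive) model: y = x * h, truncated to the input length."""
    return np.convolve(clean, rir)[: len(clean)]


def band_limit(clean, factor):
    """Bandwidth reduction (non-additive): zero the bins above roughly Nyquist / factor."""
    spectrum = np.fft.rfft(clean)
    cutoff = len(spectrum) // factor
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(clean))


# Toy stand-in for one second of speech (assumption: 16 kHz sampling rate).
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)

# Denoising input: clean signal plus white noise at 10 dB SNR.
noisy = add_noise(clean, rng.standard_normal(sr), snr_db=10.0)

# Dereverberation input: clean signal convolved with a synthetic decaying impulse response.
rir = rng.standard_normal(4000) * np.exp(-np.arange(4000) / 800.0)
reverberant = reverberate(clean, rir)

# Bandwidth extension input: content above about 4 kHz removed.
narrowband = band_limit(clean, factor=2)
```

In the additive case the corrupted signal is the clean signal plus an independent term, whereas in the other two cases the corruption acts on the clean signal itself (by filtering it or by discarding part of its spectrum), which is what "non-additive" refers to here.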


