A Study on Speech Enhancement Based on Diffusion Probabilistic Model

07/25/2021
by Yen-Ju Lu, et al.

Diffusion probabilistic models have demonstrated an outstanding capability to model natural images and raw audio waveforms through paired diffusion and reverse processes. The unique property of the reverse process (namely, eliminating non-target signals from the Gaussian noise and noisy signals) can be used to restore clean signals. Based on this property, we propose a diffusion probabilistic model-based speech enhancement (DiffuSE) model that aims to recover clean speech signals from noisy signals. The fundamental architecture of the proposed DiffuSE model is similar to that of DiffWave, a high-quality audio waveform generation model with a relatively low computational cost and footprint. To attain better enhancement performance, we designed an advanced reverse process, termed the supportive reverse process, which adds noisy speech to the predicted speech at each time step. The experimental results show that DiffuSE yields performance comparable to that of related audio generative models on the standardized Voice Bank corpus SE task. Moreover, relative to the generally suggested full sampling schedule, the proposed supportive reverse process particularly improved fast sampling, yielding better enhancement results in a few steps than the conventional full-step inference process.
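The idea of adding noisy speech to the prediction at each reverse step can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `supportive_reverse`, the callable `predict_noise`, the mixing weight `gamma`, the convex-combination update, and the choice to start the loop from the noisy signal are all assumptions made for illustration, layered on a standard DDPM-style reverse step. The exact update rule used by DiffuSE is not given in the abstract.

```python
import numpy as np

def supportive_reverse(noisy, predict_noise, betas, gamma=0.2, seed=0):
    """Toy reverse diffusion loop that re-injects the noisy input at every step.

    noisy         : 1-D array holding the noisy waveform (conditioning signal)
    predict_noise : hypothetical callable (x, t) -> estimated noise, same shape as x
    betas         : 1-D array of noise-schedule values, one per step
    gamma         : assumed mixing weight for the noisy signal (illustrative only)
    """
    rng = np.random.default_rng(seed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = noisy.copy()  # assumption: the loop starts from the noisy speech
    for t in range(len(betas) - 1, -1, -1):
        eps = predict_noise(x, t)
        # Standard DDPM posterior mean for step t
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            sigma = np.sqrt((1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t]) * betas[t])
            # Assumed "supportive" update: blend the prediction with the noisy speech
            x = (1.0 - gamma) * mean + gamma * noisy + sigma * rng.standard_normal(x.shape)
        else:
            x = mean  # final step: no added noise, no mixing
    return x

# Usage with a placeholder model that predicts zero noise (not a trained network)
noisy = np.sin(np.linspace(0.0, 1.0, 16))
betas = np.linspace(1e-4, 0.05, 6)  # a short, "fast sampling"-style schedule
enhanced = supportive_reverse(noisy, lambda x, t: np.zeros_like(x), betas)
```

In practice `predict_noise` would be a trained network conditioned on the noisy input, and the schedule and mixing weights would come from the paper rather than the placeholder values used here.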

research
11/04/2022

Cold Diffusion for Speech Enhancement

Diffusion models have recently shown promising results for difficult enh...
research
02/10/2022

Conditional Diffusion Probabilistic Model for Speech Enhancement

Speech enhancement is a critical component of many user-oriented audio a...
research
03/31/2022

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

Neural vocoder using denoising diffusion probabilistic model (DDPM) has ...
research
09/18/2023

Single and Few-step Diffusion for Generative Speech Enhancement

Diffusion models have shown promising results in speech enhancement, usi...
research
09/03/2023

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

The goal of speech enhancement (SE) is to eliminate the background inter...
research
02/28/2023

Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement

Recently, score-based generative models have been successfully employed ...
research
08/11/2022

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

Recently, diffusion-based generative models have been introduced to the ...
