Single and Few-step Diffusion for Generative Speech Enhancement

09/18/2023
by Bunlong Lay, et al.

Diffusion models have shown promising results in speech enhancement, using a task-adapted diffusion process for the conditional generation of clean speech given a noisy mixture. However, at test time, the neural network used for score estimation is called multiple times to solve the iterative reverse process. This results in slow inference and causes discretization errors that accumulate over the sampling trajectory. In this paper, we address these limitations through a two-stage training approach. In the first stage, we train the diffusion model in the usual way using the generative denoising score matching loss. In the second stage, we compute the enhanced signal by solving the reverse process and compare the resulting estimate to the clean speech target using a predictive loss. We show that this second training stage achieves the same performance as the baseline model using only 5 function evaluations instead of 60. While the performance of standard generative diffusion algorithms drops dramatically when the number of function evaluations (NFEs) is lowered to obtain single-step diffusion, we show that our proposed method maintains steady performance; it therefore substantially outperforms the diffusion baseline in this setting and also generalizes better than its predictive counterpart.
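
To make the notion of function evaluations concrete, the following is a minimal Python sketch of an iterative reverse solver whose output is scored with a predictive loss, as in the second stage described above. It is a toy under stated assumptions, not the paper's method: the variance-exploding schedule `sigma(t) = SIGMA_MAX * t`, the deterministic Euler solver, the mean-squared-error loss, and every function name are hypothetical choices made for illustration. An oracle score stands in for a trained network, and the gradient step that would fine-tune the network on this loss is omitted.

```python
import numpy as np

SIGMA_MAX = 0.5  # assumed noise scale for the toy schedule sigma(t) = SIGMA_MAX * t


def reverse_solve(score_fn, y, n_steps, rng):
    """Euler solver for the probability-flow ODE of a toy variance-exploding
    process. Starts from the noisy mixture y plus Gaussian noise at t = 1 and
    integrates to t = 0. Returns (estimate, number of score evaluations)."""
    x = y + SIGMA_MAX * rng.standard_normal(y.shape)
    ts = np.linspace(1.0, 0.0, n_steps + 1)
    nfe = 0
    for t, t_next in zip(ts[:-1], ts[1:]):
        dt = t - t_next
        g2 = 2.0 * SIGMA_MAX**2 * t          # g(t)^2 = d sigma(t)^2 / dt
        x = x + 0.5 * g2 * score_fn(x, t) * dt
        nfe += 1                              # one score evaluation per step
    return x, nfe


def predictive_loss(estimate, clean):
    """Mean squared error between the solver output and the clean target
    (an assumed choice of predictive loss)."""
    return float(np.mean((estimate - clean) ** 2))


def make_oracle_score(clean):
    """Stand-in for a trained score network: the exact conditional score of
    x_t = clean + sigma(t) * z. A real model would only approximate this."""
    def score(x, t):
        return (clean - x) / (SIGMA_MAX * t) ** 2
    return score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))   # toy "clean speech"
    noisy = clean + 0.3 * rng.standard_normal(64)        # toy noisy mixture
    for n_steps in (60, 5, 1):
        est, nfe = reverse_solve(make_oracle_score(clean), noisy, n_steps, rng)
        print(n_steps, nfe, predictive_loss(est, clean))
```

With an exact score the toy solver recovers the target at any step count, so the loss here only reflects floating-point error. With a learned, imperfect score the error would depend on the number of steps, which is the gap the second training stage is meant to close.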

research
08/11/2022

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

Recently, diffusion-based generative models have been introduced to the ...
research
07/25/2021

A Study on Speech Enhancement Based on Diffusion Probabilistic Model

Diffusion probabilistic models have demonstrated an outstanding capabili...
research
08/08/2023

Target Speech Extraction with Conditional Diffusion Model

Diffusion model-based speech enhancement has received increased attentio...
research
02/28/2023

Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement

Recently, score-based generative models have been successfully employed ...
research
06/21/2023

Diffusion Posterior Sampling for Informed Single-Channel Dereverberation

We present in this paper an informed single-channel dereverberation meth...
research
10/31/2022

Diffusion-based Generative Speech Source Separation

We propose DiffSep, a new single channel source separation method based ...
research
12/22/2022

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Diffusion models have shown a great ability at bridging the performance ...