Conditional Diffusion Probabilistic Model for Speech Enhancement

02/10/2022
by Yen-Ju Lu, et al.

Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are still lagging behind in speech enhancement. This work leverages recent advances in diffusion probabilistic models, and proposes a novel speech enhancement algorithm that incorporates characteristics of the observed noisy speech signal into the diffusion and reverse processes. More specifically, we propose a generalized formulation of the diffusion probabilistic model named conditional diffusion probabilistic model that, in its reverse process, can adapt to non-Gaussian real noises in the estimated speech signal. In our experiments, we demonstrate strong performance of the proposed approach compared to representative generative models, and investigate the generalization capability of our models to other datasets with noise characteristics unseen during training.
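
The abstract describes folding the observed noisy speech into the diffusion process but does not state the formulation. As a minimal sketch only, assuming the mean of each diffusion step is a linear interpolation between the clean signal and the noisy signal, one forward sample could look like the following. The function name, the interpolation weight m_t, and the variance expression are illustrative assumptions, not the authors' published equations.

```python
import math
import random


def conditional_diffusion_sample(clean, noisy, alpha_bar_t, m_t, rng=None):
    """Draw one diffused sample x_t conditioned on the noisy observation.

    clean, noisy: equal-length lists of float samples (clean speech and
                  observed noisy speech).
    alpha_bar_t:  cumulative signal-retention factor in (0, 1].
    m_t:          assumed interpolation weight in [0, 1]; 0 ignores the
                  noisy signal, 1 centres the sample on it.
    """
    if len(clean) != len(noisy):
        raise ValueError("clean and noisy must have the same length")
    if not 0.0 < alpha_bar_t <= 1.0:
        raise ValueError("alpha_bar_t must lie in (0, 1]")
    if not 0.0 <= m_t <= 1.0:
        raise ValueError("m_t must lie in [0, 1]")
    if rng is None:
        rng = random.Random()

    scale = math.sqrt(alpha_bar_t)
    # Assumption: start from the usual Gaussian variance (1 - alpha_bar_t)
    # and subtract the share attributed to the noisy signal, clamped at zero.
    variance = max((1.0 - alpha_bar_t) - (m_t ** 2) * alpha_bar_t, 0.0)
    std = math.sqrt(variance)

    # The mean moves from the clean signal toward the noisy one as m_t grows.
    return [
        scale * ((1.0 - m_t) * x0 + m_t * y) + std * rng.gauss(0.0, 1.0)
        for x0, y in zip(clean, noisy)
    ]
```

With m_t = 0 this reduces to an ordinary Gaussian diffusion step on the clean signal; larger m_t pulls the sample toward the noisy observation, which is the kind of conditioning the abstract refers to.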

research
07/25/2021

A Study on Speech Enhancement Based on Diffusion Probabilistic Model

Diffusion probabilistic models have demonstrated an outstanding capabili...
research
11/08/2022

DiffPhase: Generative Diffusion-based STFT Phase Retrieval

Diffusion probabilistic models have been recently used in a variety of t...
research
03/23/2023

A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI

Generative AI has demonstrated impressive performance in various fields,...
research
09/03/2023

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

The goal of speech enhancement (SE) is to eliminate the background inter...
research
12/22/2022

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Diffusion models have shown a great ability at bridging the performance ...
research
10/30/2022

SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement

Diffusion model, as a new generative model which is very popular in imag...
research
02/20/2023

DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

While diffusion models have achieved great success in generating continu...
