Solving Audio Inverse Problems with a Diffusion Model

10/27/2022
by Eloi Moliner, et al.

This paper presents CQT-Diff, a data-driven generative audio model that, once trained, can solve a variety of audio inverse problems in a problem-agnostic setting. CQT-Diff is a neural diffusion model whose architecture is carefully constructed to exploit pitch-equivariant symmetries in music. This is achieved by preconditioning the model with an invertible Constant-Q Transform (CQT), whose logarithmically spaced frequency axis represents pitch equivariance as translation equivariance. The proposed method is evaluated with objective and subjective metrics on three varied tasks: audio bandwidth extension, inpainting, and declipping. The results show that CQT-Diff outperforms the compared baselines and ablations in audio bandwidth extension and, without retraining, delivers competitive performance against modern baselines in audio inpainting and declipping. This work represents the first diffusion-based general framework for solving inverse problems in audio processing.
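
The claim that a logarithmically spaced frequency axis turns pitch shifts into translations can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not the paper's implementation: it does not compute a CQT, and the function names, the lowest frequency `f_min` (32.70 Hz), and the resolution of 12 bins per octave are assumptions chosen for the example. It places the partials of a harmonic tone on a log-spaced axis and on a linear axis, then transposes the tone by three semitones.

```python
import numpy as np

def log_axis_position(freqs_hz, f_min=32.70, bins_per_octave=12):
    """Fractional bin index of each frequency on a log-spaced (CQT-style) axis.

    f_min and bins_per_octave are illustrative assumptions, not values
    taken from the paper.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return bins_per_octave * np.log2(freqs_hz / f_min)

def harmonic_partials(f0_hz, n_partials=5):
    """Frequencies of the first n_partials harmonics of a tone at f0_hz."""
    return f0_hz * np.arange(1, n_partials + 1)

f0 = 220.0                        # hypothetical fundamental, in Hz
semitones = 3                     # transpose up by three semitones
ratio = 2.0 ** (semitones / 12)   # equal-tempered frequency ratio

original = harmonic_partials(f0)
shifted = harmonic_partials(f0 * ratio)

# Log-spaced axis: every partial moves by the same number of bins,
# so the transposition is a pure translation along the axis.
log_offsets = log_axis_position(shifted) - log_axis_position(original)

# Linear axis: each partial moves by a different amount in Hz,
# so the harmonic pattern is stretched rather than translated.
linear_offsets = shifted - original

print("offsets on log axis (bins):", np.round(log_offsets, 6))
print("offsets on linear axis (Hz):", np.round(linear_offsets, 2))
```

Under these assumptions every partial shifts by the same three bins on the log-spaced axis, while on the linear axis the offset grows with the harmonic number. This is the property that a translation-equivariant architecture, such as a convolutional one, can exploit when it operates on a CQT representation.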

research
07/02/2023

Solving Linear Inverse Problems Provably via Posterior Sampling with Latent Diffusion Models

We present the first framework to solve linear inverse problems leveragi...
research
08/15/2023

Monte Carlo guided Diffusion for Bayesian linear inverse problems

Ill-posed linear inverse problems that combine knowledge of the forward ...
research
06/02/2023

Zero-Shot Blind Audio Bandwidth Extension

Audio bandwidth extension involves the realistic reconstruction of high-...
research
05/24/2023

Diffusion-Based Audio Inpainting

Audio inpainting aims to reconstruct missing segments in corrupted recor...
research
04/13/2022

BEHM-GAN: Bandwidth Extension of Historical Music using Generative Adversarial Networks

Audio bandwidth extension aims to expand the spectrum of narrow-band aud...
research
06/15/2010

Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity

A general framework for solving image inverse problems is introduced in ...
research
06/01/2023

UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model

This paper introduces UnDiff, a diffusion probabilistic model capable of...
