Conditioning and Sampling in Variational Diffusion Models for Speech Super-resolution

10/27/2022
by   Chin-Yun Yu, et al.

Diffusion models (DMs) are increasingly used in audio processing tasks, including speech super-resolution (SR), which aims to restore high-frequency content from low-resolution speech utterances. This is commonly achieved by conditioning the noise-predictor network on the low-resolution audio. In this paper, we propose a novel sampling algorithm that communicates the information in the low-resolution audio through the reverse sampling process of DMs. The proposed method can serve as a drop-in replacement for the vanilla sampling process and can significantly improve the performance of existing methods. Moreover, by coupling the proposed sampling method with an unconditional DM, i.e., a DM with no auxiliary inputs to its noise predictor, we can generalize it to a wide range of SR setups. We also attain state-of-the-art results on the VCTK Multi-Speaker benchmark with this novel formulation.
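
The abstract does not spell out the sampling rule, so the following is only a rough sketch of one common way to pass an observation through the reverse process: after each denoising step, the low-frequency band of the sample is overwritten with a noised copy of the low-resolution audio, while the model keeps control of the high band. Everything here is an assumption made for illustration, not the paper's confirmed algorithm: the names `lowpass_fft` and `guided_reverse_step`, the ideal FFT low-pass filter, the `denoise_step` callable standing in for an unconditional DM's reverse update, and the `alpha_s` / `sigma_s` noise-schedule values.

```python
import numpy as np


def lowpass_fft(x, keep_fraction):
    """Ideal low-pass filter (assumed for illustration): zero the upper rFFT bins."""
    spec = np.fft.rfft(x)
    cutoff = int(len(spec) * keep_fraction)
    spec[cutoff:] = 0.0
    return np.fft.irfft(spec, n=len(x))


def guided_reverse_step(z_t, y_low, alpha_s, sigma_s, denoise_step, keep_fraction, rng):
    """One reverse step followed by low-band replacement (hypothetical sketch).

    z_t          : current noisy sample at the target sampling rate
    y_low        : low-resolution audio, already resampled to len(z_t)
    alpha_s      : assumed signal scale at the next (lower) noise level
    sigma_s      : assumed noise scale at the next (lower) noise level
    denoise_step : hypothetical callable mapping z_t to the next sample z_s
    """
    z_s = denoise_step(z_t)
    # Bring the observation to the same noise level as z_s.
    y_s = alpha_s * y_low + sigma_s * rng.standard_normal(y_low.shape)
    # High band comes from the model, low band from the noised observation.
    high_band = z_s - lowpass_fft(z_s, keep_fraction)
    return high_band + lowpass_fft(y_s, keep_fraction)
```

Because the observation enters only at the sampling stage in this sketch, the same step could in principle wrap either a conditional or an unconditional noise predictor, and `keep_fraction` could be changed per input to match different low-resolution bandwidths.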

research
09/13/2023

AudioSR: Versatile Audio Super-resolution at Scale

Audio super-resolution is a fundamental task that predicts high-frequenc...
research
03/28/2022

Neural Vocoder is All You Need for Speech Super-resolution

Speech super-resolution (SR) is a task to increase speech sampling rate ...
research
07/23/2023

ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting

Diffusion-based image super-resolution (SR) methods are mainly limited b...
research
03/29/2023

Implicit Diffusion Models for Continuous Super-Resolution

Image super-resolution (SR) has attracted increasing attention due to it...
research
08/02/2017

Audio Super Resolution using Neural Networks

We introduce a new audio processing technique that increases the samplin...
research
12/24/2021

Enabling Real-time On-chip Audio Super Resolution for Bone Conduction Microphones

Voice communication using the air conduction microphone in noisy environ...
research
10/24/2022

Decimated Prony's Method for Stable Super-resolution

We study recovery of amplitudes and nodes of a finite impulse train from...
