WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution

06/16/2021
by   Kexun Zhang, et al.

Audio super-resolution is the task of constructing high-resolution (HR) audio from low-resolution (LR) audio by adding the missing frequency band. Previous methods based on convolutional neural networks and a mean squared error training objective have relatively low performance, while adversarial generative models are difficult to train and tune. Recently, normalizing flows have attracted a lot of attention for their high performance, simple training and fast inference. In this paper, we propose WSRGlow, a Glow-based waveform generative model for audio super-resolution. Specifically, 1) we integrate WaveNet and Glow to directly maximize the exact likelihood of the target HR audio conditioned on LR information; and 2) to exploit the information in the LR audio, we propose an LR audio encoder and an STFT encoder, which encode the LR information in the time domain and the frequency domain, respectively. The experimental results show that the proposed model is easier to train and outperforms previous works in terms of both objective and perceptual quality. WSRGlow is also the first model to produce 48 kHz waveforms from 12 kHz LR audio.
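
To make "maximize the exact likelihood of the target HR audio conditioned on LR information" concrete, here is a minimal NumPy sketch of a single conditional affine coupling step, the basic building block of Glow-style flows. It is an illustration under stated assumptions, not the WSRGlow implementation: the function names, the shapes, and the linear map (`w`, `b`) that stands in for the WaveNet-style network are all hypothetical, and the LR features `cond` are random placeholders for what the paper's LR audio and STFT encoders would provide.

```python
import numpy as np

def coupling_forward(x, cond, w, b):
    """One conditional affine coupling step (illustrative only).

    x:    (n, d) frames of HR audio, d even.
    cond: (n, c) LR conditioning features (placeholder for encoder output).
    w, b: hypothetical parameters of a linear stand-in for the
          WaveNet-style network, shapes (d//2 + c, d) and (d,).
    """
    xa, xb = np.split(x, 2, axis=1)
    # Scale and shift depend only on the untouched half and the LR condition.
    h = np.concatenate([xa, cond], axis=1) @ w + b
    log_s, t = np.split(h, 2, axis=1)
    log_s = np.tanh(log_s)            # bounded log-scale (an assumption, for stability)
    zb = xb * np.exp(log_s) + t
    z = np.concatenate([xa, zb], axis=1)
    log_det = log_s.sum(axis=1)       # log |det Jacobian| of the transform
    return z, log_det

def coupling_inverse(z, cond, w, b):
    """Exact inverse of coupling_forward, as used at sampling time."""
    za, zb = np.split(z, 2, axis=1)
    h = np.concatenate([za, cond], axis=1) @ w + b
    log_s, t = np.split(h, 2, axis=1)
    log_s = np.tanh(log_s)
    xb = (zb - t) * np.exp(-log_s)
    return np.concatenate([za, xb], axis=1)

def log_likelihood(x, cond, w, b):
    """Exact log p(x | cond) under a standard normal prior on z."""
    z, log_det = coupling_forward(x, cond, w, b)
    d = x.shape[1]
    log_prior = -0.5 * (z ** 2).sum(axis=1) - 0.5 * d * np.log(2 * np.pi)
    return log_prior + log_det

# Toy usage with random data and parameters.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))           # 4 frames of 8 HR samples
cond = rng.normal(size=(4, 3))        # 3 LR features per frame
w = 0.1 * rng.normal(size=(4 + 3, 8))
b = np.zeros(8)
z, _ = coupling_forward(x, cond, w, b)
x_rec = coupling_inverse(z, cond, w, b)
nll = -log_likelihood(x, cond, w, b).mean()  # quantity a flow would minimize
```

Because the transform is invertible with a cheap Jacobian determinant, training can minimize the negative log-likelihood directly, and generation amounts to sampling `z` from the prior and applying the inverse with the LR condition. A full model would stack many such steps with a far more expressive conditioning network.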

research
09/13/2023

AudioSR: Versatile Audio Super-resolution at Scale

Audio super-resolution is a fundamental task that predicts high-frequenc...
research
09/30/2021

An investigation of pre-upsampling generative modelling and Generative Adversarial Networks in audio super resolution

There have been several successful deep learning models that perform aud...
research
06/11/2021

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example

Models for audio generation are typically trained on hours of recordings...
research
05/20/2022

Diverse super-resolution with pretrained deep hierarchical VAEs

Image super-resolution is a one-to-many problem, but most deep-learning ...
research
11/18/2015

Super-Resolution with Deep Convolutional Sufficient Statistics

Inverse problems in image and audio, and super-resolution in particular,...
research
12/08/2022

High Quality Audio Coding with MDCTNet

We propose a neural audio generative model, MDCTNet, operating in the pe...
research
06/20/2023

Phase Repair for Time-Domain Convolutional Neural Networks in Music Super-Resolution

Audio Super-Resolution (SR) is an important topic in the field of audio ...
