AudioSR: Versatile Audio Super-resolution at Scale

09/13/2023
by Haohe Liu, et al.

Audio super-resolution is a fundamental task that predicts high-frequency components for low-resolution audio, enhancing audio quality in digital applications. Previous methods are limited both in the audio types they can handle (e.g., music, speech) and in the specific bandwidth settings they support (e.g., 4kHz to 8kHz). In this paper, we introduce a diffusion-based generative model, AudioSR, that is capable of performing robust audio super-resolution on versatile audio types, including sound effects, music, and speech. Specifically, AudioSR can upsample any input audio signal within the bandwidth range of 2kHz to 16kHz to a high-resolution audio signal at 24kHz bandwidth with a sampling rate of 48kHz. Extensive objective evaluation on various audio super-resolution benchmarks demonstrates the strong results achieved by the proposed model. In addition, our subjective evaluation shows that AudioSR can act as a plug-and-play module to enhance the generation quality of a wide range of audio generative models, including AudioLDM, Fastspeech2, and MusicGen. Our code and demo are available at https://audioldm.github.io/audiosr.
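
To make the bandwidth figures in the abstract concrete, the following is a minimal sketch of the problem setup only, not of AudioSR itself. It assumes NumPy, and the helper names (`nyquist_bandwidth`, `lowpass_fft`, `band_energy`) are hypothetical, chosen for illustration. It shows why a 48kHz sampling rate corresponds to 24kHz of bandwidth, and how a band-limited input can be simulated by zeroing all content above a cutoff. A super-resolution model would be asked to predict the content that this step removes.

```python
import numpy as np

def nyquist_bandwidth(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2

def lowpass_fft(signal, sample_rate_hz, cutoff_hz):
    """Simulate band-limited audio by zeroing FFT bins above the cutoff.

    This is an idealized brick-wall filter used for illustration; it is an
    assumption of this sketch, not the degradation used in the paper.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    spectrum[freqs > cutoff_hz] = 0
    return np.fft.irfft(spectrum, n=len(signal))

def band_energy(signal, sample_rate_hz, low_hz, high_hz):
    """Spectral energy of the signal within [low_hz, high_hz]."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return float(np.sum(np.abs(spectrum[mask]) ** 2))

# One second of audio at 48 kHz containing a 1 kHz and a 10 kHz tone.
sample_rate = 48_000
t = np.arange(sample_rate) / sample_rate
full_band = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 10_000 * t)

# Band-limit to 4 kHz: the 10 kHz tone is removed, the 1 kHz tone survives.
cutoff = 4_000
low_res = lowpass_fft(full_band, sample_rate, cutoff)

high_before = band_energy(full_band, sample_rate, cutoff + 1, nyquist_bandwidth(sample_rate))
high_after = band_energy(low_res, sample_rate, cutoff + 1, nyquist_bandwidth(sample_rate))
low_before = band_energy(full_band, sample_rate, 0, cutoff)
low_after = band_energy(low_res, sample_rate, 0, cutoff)
```

In this sketch, `high_after` is effectively zero while `low_after` matches `low_before`, which is the gap between input and target that the abstract describes for input bandwidths from 2kHz to 16kHz.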

research
08/02/2017

Audio Super Resolution using Neural Networks

We introduce a new audio processing technique that increases the samplin...
research
06/16/2021

WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution

Audio super-resolution is the task of constructing a high-resolution (HR...
research
11/22/2022

AERO: Audio Super Resolution in the Spectral Domain

We present AERO, an audio super-resolution model that processes speech an...
research
10/27/2022

Conditioning and Sampling in Variational Diffusion Models for Speech Super-resolution

Recently, diffusion models (DMs) have been increasingly used in audio pr...
research
03/28/2022

Neural Vocoder is All You Need for Speech Super-resolution

Speech super-resolution (SR) is a task to increase speech sampling rate ...
research
12/24/2021

Enabling Real-time On-chip Audio Super Resolution for Bone Conduction Microphones

Voice communication using the air conduction microphone in noisy environ...
research
07/05/2019

Speech bandwidth extension with WaveNet

Large-scale mobile communication systems tend to contain legacy transmis...
