An investigation of pre-upsampling generative modelling and Generative Adversarial Networks in audio super resolution

09/30/2021
by James King, et al.

Several deep learning models have performed audio super-resolution successfully. Many of these approaches rely on pre-processed feature extraction, which requires a lot of domain-specific signal processing knowledge to implement. Convolutional Neural Networks (CNNs) improved upon this framework by learning filters automatically. An example of a convolutional approach is AudioUNet, which takes inspiration from novel methods of upsampling images. Our paper compares the pre-upsampling AudioUNet to a new generative model that upsamples the signal before using deep learning to transform it into a more believable signal. Based on the EDSR network for image super-resolution, the newly proposed model, AudioEDSR, outperforms AudioUNet with a 20 [...] distance and a mean opinion score of 4.06 compared to 3.82 for the two-times upsampling case. AudioEDSR also has 87 [...] incorporating AudioUNet into a Wasserstein GAN with gradient penalty (WGAN-GP) structure can affect training is also explored. Finally, the effects that artifacting has on the current state of the art are analysed, and solutions to this problem are proposed. The methods used in this paper have broad applications to telephony, audio recognition and audio generation tasks.
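
The pre-upsampling step described above (bringing the low-resolution signal up to the target sample grid before a network refines it) can be illustrated with a minimal sketch. This is not the paper's implementation: the choice of linear interpolation, the function name `linear_upsample`, and the 440 Hz test tone are all assumptions made for illustration, and the learned refinement model is omitted entirely.

```python
import math


def linear_upsample(low_res, factor):
    """Upsample a 1-D signal by an integer factor using linear interpolation.

    Assumption: linear interpolation stands in for whatever interpolation
    a real pre-upsampling pipeline would use.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    n = len(low_res)
    if n == 0:
        return []
    out = []
    for i in range(n * factor):
        pos = i / factor                  # position measured in low-res samples
        left = min(int(pos), n - 1)       # nearest low-res sample at or before pos
        right = min(left + 1, n - 1)      # next sample, clamped at the signal end
        frac = pos - left                 # fractional distance from left to right
        out.append((1.0 - frac) * low_res[left] + frac * low_res[right])
    return out


# Hypothetical example: a 440 Hz tone sampled at 8 kHz, placed on a 16 kHz grid.
low_rate = 8000
low_res = [math.sin(2 * math.pi * 440 * t / low_rate) for t in range(64)]
pre_upsampled = linear_upsample(low_res, 2)

# The output has twice as many samples; a trained model would then take
# `pre_upsampled` as input and predict the missing high-frequency detail.
print(len(low_res), len(pre_upsampled))
```

Interpolation of this kind only matches the length of the target signal; it adds no content above the original bandwidth, which is why a learned model is still needed afterwards.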
