Audio Spectral Enhancement: Leveraging Autoencoders for Low Latency Reconstruction of Long, Lossy Audio Sequences

08/08/2021
by   Darshan Deshpande, et al.

While active research in audio compression techniques has yielded substantial breakthroughs, spectral reconstruction of low-quality audio waves remains a less explored topic. In this paper, we propose a novel approach for reconstructing higher frequencies from considerably longer sequences of low-quality MP3 audio waves. Our technique involves inpainting audio spectrograms with residually stacked autoencoder blocks by manipulating individual amplitude and phase values in relation to perceptual differences. Our architecture presents several bottlenecks while preserving the spectral structure of the audio wave via skip-connections. We also compare several task metrics and demonstrate our visual guide to loss selection. Moreover, we show how to leverage differential quantization techniques to reduce the initial model size by more than half while simultaneously reducing inference time, which is crucial in real-world applications.
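To make the amplitude and phase representation mentioned in the abstract concrete, the following is a minimal NumPy sketch of splitting a framed spectrum into amplitude and phase and recombining them. It is an illustration under stated assumptions, not the authors' pipeline: the helper names (`frame_spectrum`, `split_amplitude_phase`, `merge_amplitude_phase`), the frame length, hop size, Hann window, and the synthetic 440 Hz test tone are all hypothetical choices that do not come from the paper.

```python
import numpy as np

def frame_spectrum(signal, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping Hann-windowed frames and FFT each frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real-input FFT: result has shape (n_frames, frame_len // 2 + 1)
    return np.fft.rfft(frames, axis=1)

def split_amplitude_phase(spec):
    """Return the magnitude and the angle (radians) of a complex spectrum."""
    return np.abs(spec), np.angle(spec)

def merge_amplitude_phase(amplitude, phase):
    """Rebuild the complex spectrum from its amplitude and phase parts."""
    return amplitude * np.exp(1j * phase)

# Synthetic input (assumption): a 440 Hz tone plus light noise at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000.0
signal = np.sin(2 * np.pi * 440.0 * t) + 0.01 * rng.standard_normal(t.size)

spec = frame_spectrum(signal)
amplitude, phase = split_amplitude_phase(spec)
restored = merge_amplitude_phase(amplitude, phase)
```

In a setup of this kind, a model would operate on `amplitude` and `phase` as two real-valued arrays, and its outputs would be merged back into a complex spectrum before inversion to a waveform. How the paper weights the two components against perceptual differences is not specified in the abstract, so that step is left out here.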


research
07/07/2021

SoundStream: An End-to-End Neural Audio Codec

We present SoundStream, a novel neural audio codec that can efficiently ...
research
07/22/2021

HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

An autoencoder-based codec employs quantization to turn its bottleneck l...
research
01/18/2023

An investigation of the reconstruction capacity of stacked convolutional autoencoders for log-mel-spectrograms

In audio processing applications, the generation of expressive sounds ba...
research
12/09/2019

MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing

Deep learning methods have brought substantial advancements in speech se...
research
12/18/2018

Uniform Convergence Bounds for Codec Selection

We frame the problem of selecting an optimal audio encoding scheme as a ...
research
07/14/2020

A Deep Learning Approach for Low-Latency Packet Loss Concealment of Audio Signals in Networked Music Performance Applications

Networked Music Performance (NMP) is envisioned as a potential game chan...
research
01/26/2023

A simple model for pink noise from amplitude modulations

We propose a simple model for the origin of pink noise (or 1/f fluctuati...
