Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding

12/31/2020
by   Kai Zhen, et al.

Conventional audio coding technologies commonly leverage human perception of sound, or psychoacoustics, to reduce the bitrate while preserving the perceptual quality of the decoded audio signals. For neural audio codecs, however, the objective nature of the loss function usually leads to suboptimal sound quality as well as high run-time complexity due to the large model size. In this work, we present a psychoacoustic calibration scheme that redefines the loss functions of neural audio coding systems so that they decode signals that are perceptually more similar to the reference, yet with much lower model complexity. The proposed loss function incorporates the global masking threshold, tolerating reconstruction error that corresponds to inaudible artifacts. Experimental results show that the proposed model outperforms a baseline neural codec that is twice as large and consumes 23.4% more bits per second. With the proposed method, a lightweight neural codec with only 0.9 million parameters performs near-transparent audio coding comparable to the commercial MPEG-1 Audio Layer III codec at 112 kbps.
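As a rough illustration of how a loss function can tolerate inaudible error, the sketch below weights a per-bin squared spectral error by the signal-to-mask ratio, so that bins lying well below the masking threshold contribute almost nothing. It is a minimal sketch under stated assumptions, not the formulation used in the paper: the weighting form, the function names `masking_weights` and `masked_spectral_loss`, and the assumption that the global masking threshold is already available in dB per frequency bin are all hypothetical.

```python
import numpy as np


def masking_weights(signal_power_db, mask_threshold_db):
    """Per-bin weights that grow with the signal-to-mask ratio.

    Hypothetical weighting: log10(1 + 10^(0.1 * (P - M))), with the signal
    power P and the global masking threshold M both given in dB per bin.
    """
    smr_db = (np.asarray(signal_power_db, dtype=float)
              - np.asarray(mask_threshold_db, dtype=float))
    return np.log10(1.0 + 10.0 ** (0.1 * smr_db))


def masked_spectral_loss(ref_mag, dec_mag, signal_power_db, mask_threshold_db):
    """Weighted squared error between reference and decoded magnitude spectra."""
    weights = masking_weights(signal_power_db, mask_threshold_db)
    err = (np.asarray(ref_mag, dtype=float) - np.asarray(dec_mag, dtype=float)) ** 2
    return float(np.sum(weights * err))


# Two bins: the first lies 60 dB below the mask, the second 20 dB above it.
power_db = np.array([10.0, 60.0])
mask_db = np.array([70.0, 40.0])
ref = np.array([1.0, 1.0])

# The same absolute error is placed in the masked bin, then in the audible bin.
loss_masked_bin = masked_spectral_loss(ref, np.array([0.5, 1.0]), power_db, mask_db)
loss_audible_bin = masked_spectral_loss(ref, np.array([1.0, 0.5]), power_db, mask_db)
print(loss_masked_bin < loss_audible_bin)
```

With such a weighting, the codec is penalized far less for error in the masked bin than for the same error in the audible bin, which is the intuition behind spending model capacity only where artifacts can be heard.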

