An Improved Measure of Musical Noise Based on Spectral Kurtosis

by Matteo Torcoli, et al.

Audio processing methods operating on a time-frequency representation of the signal can introduce unpleasant-sounding artifacts known as musical noise. These artifacts are observed in the context of audio coding, speech enhancement, and source separation. The change in kurtosis of the power spectrum introduced during the processing was shown to correlate with the human perception of musical noise in the context of speech enhancement, leading to the proposal of measures based on it. These baseline measures are shown here to correlate with human perception only to a limited extent. As ground truth for human perception, the results from two listening tests are considered: one involving audio coding and one involving source separation. Simple but effective perceptually motivated improvements are proposed, and the resulting new measure is shown to clearly outperform the baselines in terms of correlation with the results of both listening tests. Moreover, for the listening test on musical noise in audio coding, the exhibited correlation is nearly as good as that of the Artifact-related Perceptual Score (APS), which was found to be the best objective measure for this task. The APS is, however, computationally very expensive, whereas the proposed measure is easily computed and requires only a fraction of the computational cost of the APS.
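
To make the baseline idea concrete, the change in kurtosis of the power spectrum can be sketched as a log ratio between the kurtosis after and before processing. This is only a minimal illustration of that general idea, not the improved measure proposed in the paper, whose perceptually motivated steps are not described in the abstract. The function names, the Hann-windowed STFT settings, and the use of the plain sample kurtosis over all time-frequency bins are assumptions made for this example.

```python
import numpy as np


def power_spectrogram(x, frame_len=512, hop=256):
    """Power spectrogram from a Hann-windowed STFT (assumed settings)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def kurtosis(values):
    """Sample kurtosis m4 / m2^2 over all entries (assumption: no model fit)."""
    v = np.ravel(values)
    centered = v - v.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2


def log_kurtosis_ratio(unprocessed, processed, frame_len=512, hop=256):
    """Log of (kurtosis after processing) / (kurtosis before processing).

    Values above zero mean the processed power spectrum has heavier
    tails, i.e. more isolated time-frequency peaks.
    """
    k_in = kurtosis(power_spectrogram(unprocessed, frame_len, hop))
    k_out = kurtosis(power_spectrogram(processed, frame_len, hop))
    return np.log(k_out / k_in)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    noise = rng.standard_normal(fs)  # one second of white noise

    # Hypothetical "processed" signal: the same noise plus a short,
    # isolated tonal burst, mimicking a musical-noise-like component.
    burst = np.zeros(fs)
    t = np.arange(2048) / fs
    burst[4096 : 4096 + 2048] = np.sin(2 * np.pi * 1000.0 * t)
    processed = noise + burst

    print("identical signals:", log_kurtosis_ratio(noise, noise))
    print("with tonal burst: ", log_kurtosis_ratio(noise, processed))
```

In this toy setup, identical input and output give a ratio of zero, while the isolated tonal burst produces a few very large time-frequency bins and pushes the ratio above zero. How well such a raw ratio tracks perceived musical noise is exactly what the abstract calls into question.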


Objective Measures of Perceptual Audio Quality Reviewed: An Evaluation of Their Application Domain Dependence

Over the past few decades, computational methods have been developed to ...

An Investigation of the Effectiveness of Phase for Audio Classification

While log-amplitude mel-spectrogram has widely been used as the feature ...

Phase recovery with Bregman divergences for audio source separation

Time-frequency audio source separation is usually achieved by estimating...

Fréchet Audio Distance: A Metric for Evaluating Music Enhancement Algorithms

We propose the Fréchet Audio Distance (FAD), a novel, reference-free eva...

Audio declipping performance enhancement via crossfading

Some audio declipping methods produce waveforms that do not fully respec...

Structure and Automatic Segmentation of Dhrupad Vocal Bandish Audio

A Dhrupad vocal concert comprises a composition section that is interspe...

Multichannel Source Separation and Speech Enhancement Using the Convolutive Transfer Function

This paper addresses the problem of audio source recovery from multichan...