DPLM: A Deep Perceptual Spatial-Audio Localization Metric

by Pranay Manocha et al.

Subjective evaluations are critical for assessing the perceptual realism of sounds in audio-synthesis-driven technologies such as augmented and virtual reality. However, they are challenging to set up, fatiguing for users, and expensive. In this work, we tackle the problem of capturing the perceptual characteristics of sound localization. Specifically, we propose a framework for building a general-purpose quality metric that assesses spatial localization differences between two binaural recordings. We model localization similarity using activation-level distances from deep networks trained for direction-of-arrival (DOA) estimation. Our proposed metric (DPLM) outperforms baseline metrics in correlation with subjective ratings across a diverse set of datasets, even without the benefit of any human-labeled training data.
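
To make the idea of an activation-level distance concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy fully connected network with random weights standing in for a trained DOA estimator, and it assumes a simple aggregation (the mean absolute difference per layer, summed over layers); the abstract does not specify the architecture, the input features, or the distance function. The names `make_toy_network`, `activations`, and `activation_distance` are hypothetical.

```python
import random


def make_toy_network(in_dim, hidden_dims, seed=0):
    """Build random dense layers as a stand-in for a trained DOA network (assumption)."""
    rng = random.Random(seed)
    layers, prev = [], in_dim
    for dim in hidden_dims:
        # One weight row per output unit, one weight per input feature.
        layers.append([[rng.uniform(-1.0, 1.0) for _ in range(prev)] for _ in range(dim)])
        prev = dim
    return layers


def activations(layers, x):
    """Return the ReLU activations of every layer for the input vector x."""
    outputs = []
    for weights in layers:
        x = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
        outputs.append(x)
    return outputs


def activation_distance(layers, reference, test):
    """Sum over layers of the mean absolute activation difference (assumed aggregation)."""
    total = 0.0
    for a, b in zip(activations(layers, reference), activations(layers, test)):
        total += sum(abs(p - q) for p, q in zip(a, b)) / len(a)
    return total


# Hypothetical binaural features: left-channel values followed by right-channel values.
reference = [0.9, 0.7, 0.5, 0.1, 0.2, 0.1]
test = [0.1, 0.2, 0.1, 0.9, 0.7, 0.5]  # channels swapped, i.e. the source moved sides

network = make_toy_network(in_dim=6, hidden_dims=[8, 4])
print(activation_distance(network, reference, reference))  # identical inputs give 0.0
print(activation_distance(network, reference, test))
```

In a real setting the random layers would be replaced by a network actually trained for DOA estimation, so that differences in its internal activations reflect differences in perceived source direction rather than arbitrary projections.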



SAQAM: Spatial Audio Quality Assessment Metric

Audio quality assessment is critical for assessing the perceptual realis...

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Human subjective evaluation is the gold standard to evaluate speech qual...

Towards a perceptual distance metric for auditory stimuli

Although perceptual (dis)similarity between sensory stimuli seems akin t...

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

Assessment of many audio processing tasks relies on subjective evaluatio...

Scoot: A Perceptual Metric for Facial Sketches

While it is trivial for humans to quickly assess the perceptual similari...

CDPAM: Contrastive learning for perceptual audio similarity

Many speech processing methods based on deep learning require an automat...

PerceptNet: Learning Perceptual Similarity of Haptic Textures in Presence of Unorderable Triplets

In order to design haptic icons or build a haptic vocabulary, we require...