SAQAM: Spatial Audio Quality Assessment Metric

06/24/2022
by   Pranay Manocha, et al.

Audio quality assessment is critical for assessing the perceptual realism of sounds. However, the time and expense of obtaining "gold standard" human judgments limit the availability of such data. For AR/VR, good perceived sound quality and localizability of sources are among the key elements to ensure complete immersion of the user. Our work introduces SAQAM, which uses a multi-task learning framework to assess listening quality (LQ) and spatialization quality (SQ) between any given pair of binaural signals without using any subjective data. We model LQ by training on a simulated dataset of triplet human judgments, and SQ by utilizing activation-level distances from networks trained for direction of arrival (DOA) estimation. We show that SAQAM correlates well with human responses across four diverse datasets. Since it is a deep network, the metric is differentiable, making it suitable as a loss function for other tasks. For example, simply replacing an existing loss with our metric yields improvement in a speech-enhancement network.
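To make the idea of an activation-level distance concrete, here is a minimal NumPy sketch. It is not the SAQAM implementation: the small random network stands in for a trained DOA network, and the layer sizes, the flattened 512-sample input, and the function names (make_toy_network, activations, activation_distance) are all hypothetical choices made for illustration. The sketch compares a reference and a test signal by summing the mean absolute difference of their activations at every layer.

```python
import numpy as np

def make_toy_network(rng, sizes):
    """Return (weight, bias) pairs for a small fully connected network.

    Assumption: random weights stand in for a network trained on DOA estimation.
    """
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def activations(network, x):
    """Run x through the network and collect the ReLU output of every layer."""
    outputs = []
    h = x
    for weight, bias in network:
        h = np.maximum(h @ weight + bias, 0.0)
        outputs.append(h)
    return outputs

def activation_distance(network, reference, test):
    """Sum over layers of the mean absolute difference between activations."""
    ref_acts = activations(network, reference)
    test_acts = activations(network, test)
    return float(sum(np.mean(np.abs(a - b)) for a, b in zip(ref_acts, test_acts)))

rng = np.random.default_rng(0)
network = make_toy_network(rng, sizes=[512, 128, 64, 32])

# Hypothetical input: a binaural frame flattened to 512 values (2 channels x 256 samples).
reference = rng.standard_normal(512)
slightly_degraded = reference + 0.01 * rng.standard_normal(512)
heavily_degraded = reference + 1.0 * rng.standard_normal(512)

d_same = activation_distance(network, reference, reference)
d_small = activation_distance(network, reference, slightly_degraded)
d_large = activation_distance(network, reference, heavily_degraded)
print(d_same, d_small, d_large)
```

In this toy setup the distance is zero for identical inputs and grows with the amount of added noise. Because every step is built from differentiable operations (apart from the ReLU kink), the same computation written in an autodiff framework could be used as a training loss, which is the property the abstract relies on.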


research
05/29/2021

DPLM: A Deep Perceptual Spatial-Audio Localization Metric

Subjective evaluations are critical for assessing the perceptual realism...
research
01/13/2020

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

Assessment of many audio processing tasks relies on subjective evaluatio...
research
02/09/2021

CDPAM: Contrastive learning for perceptual audio similarity

Many speech processing methods based on deep learning require an automat...
research
08/20/2017

Perceptual audio loss function for deep learning

PESQ and POLQA are standards for automated assessment of...
research
09/16/2021

NORESQA – A Framework for Speech Quality Assessment using Non-Matching References

The perceptual task of speech quality assessment (SQA) is a challenging ...
research
09/14/2023

Multi-dimensional Speech Quality Assessment in Crowdsourcing

Subjective speech quality assessment is the gold standard for evaluating...
research
03/11/2013

Linear NDCG and Pair-wise Loss

Linear NDCG is used for measuring the performance of the Web content qua...
