DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

10/05/2021
by   Chandan K A Reddy, et al.
0

Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We have recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using the scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall quality of the audio clip. ITU-T Rec. P.835 subjective evaluation framework gives the standalone quality scores of speech and background noise in addition to the overall quality. In this work, we train an objective metric based on P.835 human ratings that outputs 3 scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) the overall quality (OVRL) of the audio. The developed metric is highly correlated with human ratings, with a Pearson's Correlation Coefficient (PCC)=0.94 for SIG and PCC=0.98 for BAK and OVRL. This is the first non-intrusive P.835 predictor we are aware of. DNSMOS P.835 is made publicly available as an Azure service.

READ FULL TEXT
research
10/28/2020

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Human subjective evaluation is the gold standard to evaluate speech qual...
research
05/24/2023

PLCMOS – a data-driven non-intrusive metric for the evaluation of packet loss concealment algorithms

Speech quality assessment is a problem for every researcher working on m...
research
04/13/2022

Predicting score distribution to improve non-intrusive speech quality estimation

Deep noise suppressors (DNS) have become an attractive solution to remov...
research
12/03/2020

Individually amplified text-to-speech

Text-to-speech (TTS) offers the opportunity to compensate for a hearing ...
research
08/12/2021

To Rate or Not To Rate: Investigating Evaluation Methods for Generated Co-Speech Gestures

While automatic performance metrics are crucial for machine learning of ...
research
06/16/2019

Parametric Resynthesis with neural vocoders

Noise suppression systems generally produce output speech with copromise...
research
07/31/2020

A Pyramid Recurrent Network for Predicting Crowdsourced Speech-Quality Ratings of Real-World Signals

The real-world capabilities of objective speech quality measures are lim...

Please sign up or login with your details

Forgot password? Click here to reset