Residual-Guided Non-Intrusive Speech Quality Assessment

03/22/2022
by Zhe Ye, et al.

This paper proposes an approach to improve non-intrusive speech quality assessment (NI-SQA) based on the residuals between impaired speech and enhanced speech. The main difficulty in this task is the lack of information, because the corresponding reference speech is absent. We generate enhanced speech from the impaired speech to compensate for the missing reference audio, then pair the residual information with the impaired speech. Compared to feeding the impaired speech directly into the model, the residuals can bring extra helpful information from the contrast introduced by enhancement. The human ear is sensitive to certain kinds of noise in ways that a deep learning model is not, so the Mean Opinion Score (MOS) predicted by the model does not fit subjective perception well and deviates from it. The residuals are closely related to the reference speech and therefore improve the ability of deep learning models to predict MOS. During the training phase, experimental results demonstrate that pairing the input with residuals quickly yields better evaluation indicators under the same conditions. Furthermore, our final results improved by 31.3 percent and 14.1 percent in PLCC and RMSE, respectively.
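The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_enhance` is a hypothetical stand-in (a moving-average filter) for a real speech enhancement model, the residual is assumed to be a plain time-domain difference between impaired and enhanced speech, and the pairing is assumed to be a simple two-channel stack. The abstract does not specify any of these details. The `plcc` and `rmse` helpers use the standard definitions of the Pearson linear correlation coefficient and root-mean-square error.

```python
import numpy as np


def toy_enhance(impaired: np.ndarray, kernel_size: int = 5) -> np.ndarray:
    """Hypothetical stand-in for an enhancement model: moving-average smoothing."""
    kernel = np.ones(kernel_size) / kernel_size
    return np.convolve(impaired, kernel, mode="same")


def paired_input(impaired: np.ndarray) -> np.ndarray:
    """Stack the impaired signal with its residual as a 2-channel model input.

    Assumption: residual = impaired - enhanced, computed in the time domain.
    """
    enhanced = toy_enhance(impaired)
    residual = impaired - enhanced
    return np.stack([impaired, residual], axis=0)  # shape: (2, num_samples)


def plcc(pred, target) -> float:
    """Pearson linear correlation coefficient between predicted and true MOS."""
    return float(np.corrcoef(pred, target)[0, 1])


def rmse(pred, target) -> float:
    """Root-mean-square error between predicted and true MOS."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "impaired speech": a sine tone plus noise, for illustration only.
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    impaired = np.sin(2 * np.pi * 220 * t) + 0.1 * rng.standard_normal(t.size)
    features = paired_input(impaired)
    print(features.shape)
```

A MOS predictor would then consume `features` in place of the impaired signal alone, and its predictions would be scored against listener ratings with `plcc` (higher is better) and `rmse` (lower is better).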

