Speech Quality Assessment through MOS using Non-Matching References

06/24/2022
by Pranay Manocha, et al.

Human judgments obtained through Mean Opinion Scores (MOS) are the most reliable way to assess the quality of speech signals. However, several recent attempts to estimate MOS automatically with deep learning lack robustness and generalization, which limits their use in real-world applications. In this work, we present a novel framework, NORESQA-MOS, for estimating the MOS of a speech signal. Unlike prior work, our approach uses non-matching references as a form of conditioning to ground the MOS estimation by neural networks. We show that NORESQA-MOS generalizes better and estimates MOS more robustly than previous state-of-the-art methods such as DNSMOS and NISQA, even though it is trained on a smaller training set. We also show that our generic framework can be combined with other learning methods, such as self-supervised learning, and can further supplement the benefits of these methods.
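To make the idea of grounding an estimate in non-matching references concrete, here is a minimal Python sketch. The abstract says only that references are used as conditioning; the specific scheme below (predict the MOS difference between the test signal and each reference, then average) is an assumption for illustration, not a description of the NORESQA-MOS architecture. The names `estimate_mos`, `predict_diff`, and `toy_predict_diff` are hypothetical, and the toy predictor merely stands in for a trained neural network.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Signal = Sequence[float]


def estimate_mos(
    test: Signal,
    references: Sequence[Tuple[Signal, float]],
    predict_diff: Callable[[Signal, Signal], float],
) -> float:
    """Estimate the MOS of `test` from references with known MOS.

    Assumption: `predict_diff(test, ref)` returns an estimate of
    MOS(test) - MOS(ref). The references need not contain the same
    speech content as the test signal (they are "non-matching").
    """
    if not references:
        raise ValueError("at least one reference is required")
    # Each reference yields its own estimate of the test signal's MOS.
    estimates: List[float] = [
        ref_mos + predict_diff(test, ref) for ref, ref_mos in references
    ]
    # Average the per-reference estimates and clip to the 1-5 MOS scale.
    return min(5.0, max(1.0, mean(estimates)))


def toy_predict_diff(test: Signal, ref: Signal) -> float:
    """Hypothetical stand-in for a trained network.

    Uses mean absolute amplitude as a crude quality proxy, which has no
    perceptual validity and exists only to make the sketch runnable.
    """
    def proxy(x: Signal) -> float:
        return 5.0 - 4.0 * min(1.0, mean([abs(v) for v in x]))

    return proxy(test) - proxy(ref)


if __name__ == "__main__":
    refs = [([0.0, 0.0], 5.0), ([1.0, 1.0], 1.0)]
    print(estimate_mos([0.5, 0.5], refs, toy_predict_diff))
```

Averaging over several references is one plausible way to reduce the influence of any single reference; a real system would replace `toy_predict_diff` with a learned model.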


research
09/16/2021

NORESQA – A Framework for Speech Quality Assessment using Non-Matching References

The perceptual task of speech quality assessment (SQA) is a challenging ...
research
03/16/2019

Non-intrusive speech quality assessment using neural networks

Estimating the perceived quality of an audio signal is critical for many...
research
11/28/2016

AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech

Developers of text-to-speech synthesizers (TTS) often make use of human ...
research
10/01/2020

SESQA: semi-supervised learning for speech quality assessment

Automatic speech quality assessment is an important, transversal task wh...
research
03/22/2022

Residual-Guided Non-Intrusive Speech Quality Assessment

This paper proposes an approach to improve Non-Intrusive speech quality ...
research
06/28/2022

Comparison of Speech Representations for the MOS Prediction System

Automatic methods to predict Mean Opinion Score (MOS) of listeners have ...
research
02/07/2022

Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks

Impulse response estimation in high noise and in-the-wild settings, with...
