Coded Speech Quality Measurement by a Non-Intrusive PESQ-DNN

04/18/2023
by Ziyi Xu, et al.

Wideband codecs such as AMR-WB or EVS are widely used in (mobile) speech communication. Coded speech quality is often evaluated subjectively in an absolute category rating (ACR) listening test. However, the ACR test is impractical for online monitoring of speech communication networks. Perceptual evaluation of speech quality (PESQ) is one of the most widely used metrics for instrumentally predicting the results of an ACR test. However, the PESQ algorithm requires an original reference signal, which is usually unavailable in network monitoring, and this limits its applicability. NISQA is a recent non-intrusive neural-network-based speech quality measure that focuses on super-wideband speech signals. In this work, however, we aim at predicting the well-known PESQ metric using a non-intrusive PESQ-DNN model. We illustrate the potential of this model by predicting the PESQ scores of wideband-coded speech obtained from AMR-WB or EVS codecs operating at different bitrates in noisy, tandeming, and error-prone transmission conditions. We compare our methods with the state-of-the-art network topologies of QualityNet, WAWEnet, and DNSMOS (all applied to PESQ prediction) by measuring the mean absolute error (MAE) and the linear correlation coefficient (LCC). The proposed PESQ-DNN offers the best total MAE and LCC of 0.11 and 0.92, respectively, in conditions without frame loss, and it remains the best when frame loss is included. Note that our model could similarly be used to non-intrusively predict POLQA or other (intrusive) metrics. Upon article acceptance, code will be provided on GitHub.
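The two evaluation measures named in the abstract, MAE and LCC, can be illustrated with a minimal sketch. It assumes LCC denotes the Pearson correlation coefficient between predicted and reference PESQ scores. The function names and the sample scores are hypothetical and are not taken from the paper or its code.

```python
import math


def mean_absolute_error(predicted, target):
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def linear_correlation(predicted, target):
    """Pearson correlation coefficient (assumed here to be the LCC)."""
    if len(predicted) != len(target) or len(target) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(target)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / math.sqrt(var_p * var_t)


# Hypothetical scores, for illustration only (not results from the paper).
reference_pesq = [1.8, 2.5, 3.1, 3.9, 4.2]
predicted_pesq = [1.9, 2.4, 3.3, 3.8, 4.1]

mae = mean_absolute_error(predicted_pesq, reference_pesq)
lcc = linear_correlation(predicted_pesq, reference_pesq)
```

A lower MAE and an LCC closer to 1 both indicate that the non-intrusive prediction tracks the intrusive PESQ score more closely.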

research
11/10/2021

HASA-net: A non-intrusive hearing-aid speech assessment network

Without the need of a clean reference, non-intrusive speech assessment m...
research
05/04/2022

Does a PESQNet (Loss) Require a Clean Reference Input? The Original PESQ Does, But ACR Listening Tests Don't

Perceptual evaluation of speech quality (PESQ) requires a clean speech r...
research
10/28/2020

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Human subjective evaluation is the gold standard to evaluate speech qual...
research
04/09/2021

Speech Quality Assessment in Crowdsourcing: Comparison Category Rating Method

Traditionally, Quality of Experience (QoE) for a communication system is...
research
07/29/2020

DNN No-Reference PSTN Speech Quality Prediction

Classic public switched telephone networks (PSTN) are often a black box ...
research
06/25/2018

Convolutional Neural Networks to Enhance Coded Speech

Enhancing coded speech suffering from far-end acoustic background noise,...
research
10/10/2019

Hierarchical Representation Network for Steganalysis of QIM Steganography in Low-Bit-Rate Speech Signals

With the volume of Voice over IP (VoIP) traffic rising sharply, more and ...
