Objective Metrics to Evaluate Residual-Echo Suppression During Double-Talk

07/15/2021
by Amir Ivry, et al.
Human subjective evaluation is optimal for assessing speech quality as perceived by humans. The recently introduced deep noise suppression mean opinion score (DNSMOS) metric was shown to estimate human ratings with great accuracy. The signal-to-distortion ratio (SDR) metric is widely used to evaluate residual-echo suppression (RES) systems by estimating speech quality during double-talk. However, since the SDR is affected by both speech distortion and residual-echo presence, it does not correlate well with human ratings according to the DNSMOS. To address this, we introduce two objective metrics that separately quantify the desired-speech maintained level (DSML) and the residual-echo suppression level (RESL) during double-talk. These metrics are evaluated using a deep learning-based RES system with a tunable design parameter. Using 280 hours of real and simulated recordings, we show that the DSML and RESL correlate well with the DNSMOS and generalize well to various setups. We also empirically investigate how tuning the RES-system design parameter shapes the DSML-RESL tradeoff, and offer a practical design scheme for dynamic system requirements.
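
The claim that the SDR mixes speech distortion with residual-echo presence can be illustrated with a small numerical sketch. It assumes the simplest energy-ratio form of the SDR, 10 log10(||s||^2 / ||s - s_hat||^2), which may differ from the exact variant used in the paper. The signals are synthetic white-noise stand-ins rather than real recordings, and the function and variable names are hypothetical.

```python
import numpy as np

def sdr_db(desired, estimate, eps=1e-12):
    """Energy-ratio SDR in dB (assumed simple form, not the paper's exact definition)."""
    desired = np.asarray(desired, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_power = np.sum(desired ** 2)
    error_power = np.sum((desired - estimate) ** 2)
    return 10.0 * np.log10((signal_power + eps) / (error_power + eps))

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)  # stand-in for desired near-end speech
echo = rng.standard_normal(16000)    # stand-in for residual echo

# Case A: speech is untouched, but residual echo leaks through.
echo_only = speech + 0.1 * echo
# Case B: echo is fully removed, but the speech is attenuated.
distortion_only = 0.9 * speech

sdr_echo = sdr_db(speech, echo_only)
sdr_distortion = sdr_db(speech, distortion_only)
print(f"SDR, residual echo only:     {sdr_echo:.2f} dB")
print(f"SDR, speech distortion only: {sdr_distortion:.2f} dB")
```

Both cases score close to 20 dB even though the degradations are of different kinds, so a single SDR value cannot say which one occurred. Separate measures for maintained speech and suppressed echo, such as the DSML and RESL, are meant to resolve this ambiguity.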
