CCATMos: Convolutional Context-aware Transformer Network for Non-intrusive Speech Quality Assessment

11/04/2022
by Yuchen Liu, et al.

Speech quality assessment is a critical component of many voice communication applications, such as telephony and online conferencing. Traditional intrusive speech quality assessment requires the clean reference of the degraded utterance to provide an accurate quality measurement, which limits the usability of these methods in real-world scenarios. Non-intrusive subjective measurement, on the other hand, is the “gold standard” for evaluating speech quality, as human listeners can intrinsically evaluate the quality of any degraded speech with ease. In this paper, we propose a novel end-to-end model structure called the Convolutional Context-Aware Transformer (CCAT) network to predict the mean opinion score (MOS) of human raters. We evaluate our model on three MOS-annotated datasets spanning multiple languages and distortion types and submit our results to the ConferencingSpeech 2022 Challenge. Our experiments show that CCAT provides promising MOS predictions compared to current state-of-the-art non-intrusive speech assessment models: relative to the baseline model on the challenge evaluation test set, the average Pearson correlation coefficient (PCC) increases from 0.530 to 0.697 and the average RMSE decreases from 0.768 to 0.570.
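The two evaluation metrics named in the abstract, PCC and RMSE, can be illustrated with a minimal sketch. It assumes plain utterance-level scoring of predicted MOS against human MOS labels; the abstract does not say how the challenge aggregates or maps scores, and the function names and sample values below are hypothetical.

```python
import math


def pearson_cc(pred, target):
    """Pearson correlation coefficient between predicted and reference MOS."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)


def rmse(pred, target):
    """Root-mean-square error between predicted and reference MOS."""
    squared_errors = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical MOS values on the usual 1-5 scale, for illustration only.
predicted_mos = [3.1, 4.2, 2.0, 3.8]
human_mos = [3.0, 4.5, 1.8, 3.5]
print(pearson_cc(predicted_mos, human_mos), rmse(predicted_mos, human_mos))
```

A higher PCC means the predictions track the ordering and spread of the human ratings more closely, while a lower RMSE means they are closer in absolute value. The two can diverge, which is presumably why the abstract reports both.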


research
04/02/2021

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment

The objective speech quality assessment is usually conducted by comparin...
research
10/20/2017

Detecting Online Hate Speech Using Context Aware Models

In the wake of a polarizing election, the cyber world is laden with hate...
research
01/28/2023

Layout-aware Webpage Quality Assessment

Identifying high-quality webpages is fundamental for real-world search e...
research
09/04/2023

BadSQA: Stealthy Backdoor Attacks Using Presence Events as Triggers in Non-Intrusive Speech Quality Assessment

Non-Intrusive speech quality assessment (NISQA) has gained significant a...
research
05/16/2020

Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models

Many applications of speech technology require more and more audio data....
research
11/12/2022

Efficient Speech Quality Assessment using Self-supervised Framewise Embeddings

Automatic speech quality assessment is essential for audio researchers, ...
research
09/22/2020

iWash: A Smartwatch Handwashing Quality Assessment and Reminder System with Real-time Feedback in the Context of Infectious Disease

Washing hands properly and frequently is the simplest and most cost-effe...
