RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting

08/31/2023
by   Hui Wang, et al.

Automatic Mean Opinion Score (MOS) prediction is crucial for evaluating the perceptual quality of synthetic speech. While recent approaches using pre-trained self-supervised learning (SSL) models have shown promising results, they only partly address the data scarcity issue, namely for the feature extractor. The data scarcity issue for the decoder remains unresolved, leading to suboptimal performance. To address this challenge, we propose a retrieval-augmented MOS prediction method, dubbed RAMP, to make the decoder more robust to data scarcity. We also propose a fusing network that uses the predictive confidence to dynamically adjust both the retrieval scope for each instance and the fusion weights. Experimental results show that our proposed method outperforms existing methods in multiple scenarios.
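The abstract does not give implementation details, so the following is only a rough sketch of the general idea of confidence-weighted retrieval fusion, not the RAMP architecture itself. It assumes a datastore of (embedding, MOS label) pairs, a nearest-neighbour average as the retrieval prediction, and a hand-written rule in which lower confidence widens the retrieval scope and shifts weight toward retrieval. The names `knn_mos`, `fuse`, `max_k`, and `EXAMPLE_STORE` are hypothetical, and the paper's fusing network is learned rather than rule-based.

```python
import math

# Hypothetical toy datastore of (embedding, MOS label) pairs.
EXAMPLE_STORE = [
    ((0.0, 0.0), 2.0),
    ((1.0, 0.0), 4.0),
    ((5.0, 5.0), 1.0),
]


def knn_mos(query, datastore, k):
    """Average the MOS labels of the k stored embeddings nearest to query."""
    ranked = sorted(datastore, key=lambda item: math.dist(query, item[0]))
    neighbours = ranked[:k]  # fewer than k items if the datastore is small
    return sum(label for _, label in neighbours) / len(neighbours)


def fuse(parametric_mos, confidence, query, datastore, max_k=4):
    """Blend a decoder prediction with a retrieval-based prediction.

    Assumed rule (not from the paper): confidence in [0, 1] is used
    directly as the weight of the decoder prediction, and lower
    confidence retrieves more neighbours, up to max_k.
    """
    confidence = min(max(confidence, 0.0), 1.0)
    k = max(1, round((1.0 - confidence) * max_k))
    retrieved = knn_mos(query, datastore, k)
    return confidence * parametric_mos + (1.0 - confidence) * retrieved
```

With full confidence the decoder prediction is returned unchanged. With zero confidence the result is the plain average over the retrieved neighbours.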


