SpeechLMScore: Evaluating speech generation using speech language model

12/08/2022
by Soumi Maiti, et al.

While human evaluation is the most reliable way to evaluate speech generation systems, it is generally costly and time-consuming. Previous studies on automatic speech quality assessment address this problem by predicting human evaluation scores with machine learning models. However, they rely on supervised learning and thus suffer from high annotation costs and domain-shift problems. We propose SpeechLMScore, an unsupervised metric that evaluates generated speech using a speech language model. SpeechLMScore maps a speech signal into discrete tokens and computes the average log-probability of generating that token sequence under the language model. It therefore requires no human annotation and is highly scalable. Evaluation results show that the proposed metric correlates promisingly with human evaluation scores on several speech generation tasks, including voice conversion, text-to-speech, and speech enhancement.
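The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the speech signal has already been mapped to integer tokens (the tokenizer is not shown), and it assumes the usual autoregressive form of the score, (1/T) Σ log p(d_i | d_<i). The `BigramLM` class is a hypothetical toy stand-in with add-one smoothing, not the speech language model used in the paper, and all names below are illustrative.

```python
import math
from collections import Counter
from typing import Sequence

START = -1  # assumed marker for the start of a token sequence


class BigramLM:
    """Toy stand-in for a speech language model (add-one smoothed bigrams)."""

    def __init__(self, vocab_size: int, corpus: Sequence[Sequence[int]]):
        self.vocab_size = vocab_size
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for seq in corpus:
            prev = START
            for tok in seq:
                self.pair_counts[(prev, tok)] += 1
                self.context_counts[prev] += 1
                prev = tok

    def log_prob(self, prev: int, tok: int) -> float:
        # Add-one smoothing keeps unseen token pairs at non-zero probability.
        num = self.pair_counts[(prev, tok)] + 1
        den = self.context_counts[prev] + self.vocab_size
        return math.log(num / den)


def average_log_prob(lm: BigramLM, tokens: Sequence[int]) -> float:
    """Average per-token log-probability of a discrete token sequence."""
    if not tokens:
        raise ValueError("token sequence must be non-empty")
    total = 0.0
    prev = START
    for tok in tokens:
        total += lm.log_prob(prev, tok)
        prev = tok
    return total / len(tokens)


# Hypothetical data: the "natural" pattern dominates the training corpus.
corpus = [[0, 1, 2, 3]] * 20
lm = BigramLM(vocab_size=4, corpus=corpus)
score_natural = average_log_prob(lm, [0, 1, 2, 3])
score_unusual = average_log_prob(lm, [3, 2, 1, 0])
```

In this toy setting, the sequence that resembles the training data receives the higher (less negative) score. That is the intuition behind using model likelihood as a quality signal without human labels.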


