NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning

06/21/2023
by   Kamer Ali Yuksel, et al.

This paper introduces NoRefER, a novel referenceless quality metric for automatic speech recognition (ASR) systems. Traditional reference-based metrics for evaluating ASR systems require costly ground-truth transcripts. NoRefER overcomes this limitation by fine-tuning a multilingual language model for pairwise ranking of ASR hypotheses using contrastive learning with a Siamese network architecture. The self-supervised version of NoRefER exploits the known quality relationships between hypotheses produced at multiple compression levels of an ASR model to learn to rank intra-sample hypotheses by quality, which is essential for model comparisons. The semi-supervised version also uses a referenced dataset to improve its inter-sample quality ranking, which is crucial for selecting potentially erroneous samples. The results show that NoRefER correlates highly with reference-based metrics and their intra-sample ranks, suggesting strong potential for referenceless ASR evaluation or A/B testing.
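The pairwise-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the loss function, so a standard pairwise logistic loss is assumed here, and the names `make_pairs`, `pairwise_logistic_loss`, `siamese_loss`, and `toy_scorer` are hypothetical. The sketch also assumes that a less compressed ASR model yields the better hypothesis and that a higher score means higher quality. A simple word-counting function stands in for the fine-tuned language model.

```python
import math
from itertools import combinations


def make_pairs(hypotheses_by_level):
    """Build (better, worse) training pairs for one audio sample.

    `hypotheses_by_level` is ordered from the least compressed ASR model
    to the most compressed one (assumed best to worst). Pairs whose two
    texts are identical carry no ranking signal and are skipped.
    """
    pairs = []
    for i, j in combinations(range(len(hypotheses_by_level)), 2):
        better, worse = hypotheses_by_level[i], hypotheses_by_level[j]
        if better != worse:
            pairs.append((better, worse))
    return pairs


def pairwise_logistic_loss(score_better, score_worse):
    """Return -log(sigmoid(score_better - score_worse)), computed stably."""
    x = score_worse - score_better
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def siamese_loss(scorer, pairs):
    """Mean pairwise loss; the same scorer (shared weights) scores both sides."""
    losses = [pairwise_logistic_loss(scorer(b), scorer(w)) for b, w in pairs]
    return sum(losses) / len(losses)


# Hypothetical stand-in for the fine-tuned language model: it counts
# how many words of a hypothesis appear in a tiny vocabulary.
VOCAB = {"the", "cat", "sat", "on", "mat"}


def toy_scorer(text):
    return float(sum(word in VOCAB for word in text.split()))


hypotheses = ["the cat sat on the mat", "the cat sat on the mad", "the cat sad on the mad"]
pairs = make_pairs(hypotheses)
print(len(pairs), siamese_loss(toy_scorer, pairs))
```

In a real setup the scorer would be a trainable model and the loss would be minimized by gradient descent; here the loss is only evaluated to show how correctly ordered pairs yield a smaller value than misordered ones.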


