Utterance-level neural confidence measure for end-to-end children speech recognition

09/16/2021
by Wei Liu, et al.

A confidence measure is a performance index of particular importance for automatic speech recognition (ASR) systems deployed in real-world scenarios. This study investigates the utterance-level neural confidence measure (NCM) in end-to-end automatic speech recognition (E2E ASR). The E2E system adopts the joint CTC-attention Transformer architecture. NCM prediction is formulated as a binary classification task, i.e., accepting or rejecting the input utterance, based on a set of predictor features acquired during the ASR decoding process. The investigation focuses on evaluating and comparing the efficacy of predictor features derived from different internal and external modules of the E2E system. Experiments are carried out on children's speech, for which state-of-the-art ASR systems show less than satisfactory performance and a robust confidence measure is therefore particularly useful. Predictor features related to the acoustic information of speech are found to play a more important role in estimating the confidence measure than those related to linguistic information. N-best score features perform significantly better than single-best ones. It is also shown that the EER and AUC metrics are not appropriate for evaluating the NCM of a mismatched ASR system with a significant performance gap.
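The abstract names EER and AUC as evaluation metrics for the accept/reject decision but does not say how they are computed. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function names, the label convention (1 = utterance should be accepted, 0 = should be rejected), and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def error_rates(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false-accept rate, false-reject rate) for every candidate threshold.

    Assumed convention: an utterance is accepted when its confidence score
    is >= the threshold; label 1 means it should be accepted, 0 rejected.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    # Every distinct score is a threshold; +inf covers "reject everything".
    for t in sorted(set(scores)) + [float("inf")]:
        false_accepts = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        false_rejects = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        points.append((false_accepts / n_neg, false_rejects / n_pos))
    return points


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def eer(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Equal error rate: the operating point where the two error rates are closest."""
    far, frr = min(error_rates(scores, labels), key=lambda p: abs(p[0] - p[1]))
    return (far + frr) / 2


# Hypothetical confidence scores for six utterances (not data from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
print(f"AUC = {auc(toy_scores, toy_labels):.3f}, EER = {eer(toy_scores, toy_labels):.3f}")
```

Both metrics are computed separately over the accept and reject classes, so they summarize how well the scores rank utterances rather than how many utterances fall in each class.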
