An evaluation of word-level confidence estimation for end-to-end automatic speech recognition

01/14/2021
by Dan Oneata, et al.

Quantifying the confidence (or, conversely, the uncertainty) of a prediction is a highly desirable trait of an automatic system, as it improves robustness and usefulness in downstream tasks. In this paper we investigate confidence estimation for end-to-end automatic speech recognition (ASR). Previous work has addressed confidence measures for lattice-based ASR, while current machine learning research mostly focuses on confidence measures for unstructured deep learning. However, although ASR systems are increasingly built upon deep end-to-end methods, little work has tried to develop confidence measures in this context. We fill this gap by providing an extensive benchmark of popular confidence methods on four well-known speech datasets. We overcome two challenges in adapting existing methods: working on structured data (sequences) and obtaining confidences at a coarser level than the predictions (words instead of tokens). Our results suggest that a strong baseline can be obtained by scaling the logits by a learnt temperature, then estimating the confidence as the negative entropy of the predictive distribution and, finally, sum pooling to aggregate at the word level.
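The baseline described in the abstract (temperature scaling, negative entropy, sum pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `token_confidence` and `word_confidence`, the `word_ids` token-to-word mapping, and the toy logits are all hypothetical, and the temperature is assumed to have been learnt beforehand on held-out data rather than fitted here.

```python
import numpy as np

def token_confidence(logits, temperature=1.0):
    """Negative entropy of the temperature-scaled distribution, per token.

    logits: array of shape (num_tokens, vocab_size).
    temperature: scalar, assumed to be learnt elsewhere (hypothetical input).
    """
    scaled = logits / temperature
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = scaled - scaled.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    probs = np.exp(log_probs)
    entropy = -(probs * log_probs).sum(axis=-1)
    return -entropy  # higher (closer to 0) means more confident

def word_confidence(token_conf, word_ids):
    """Sum-pool token confidences into one score per word.

    word_ids[i] is the index of the word that token i belongs to
    (an assumed input; how tokens map to words depends on the tokenizer).
    """
    word_ids = np.asarray(word_ids)
    pooled = np.zeros(word_ids.max() + 1)
    np.add.at(pooled, word_ids, token_conf)  # unbuffered scatter-add
    return pooled

# Toy example: 3 tokens over a 4-symbol vocabulary, forming 2 words.
logits = np.array([[0.0, 0.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
conf = token_confidence(logits, temperature=1.0)
words = word_confidence(conf, word_ids=[0, 0, 1])
```

Note that sum pooling makes a word's score depend on how many tokens it spans, since each token contributes a non-positive term; this sketch only shows the mechanics and makes no claim about how the scores compare with other pooling choices.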

research
03/25/2021

Residual Energy-Based Models for End-to-End Speech Recognition

End-to-end models with auto-regressive decoders have shown impressive re...
research
12/16/2022

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

This paper presents a class of new fast non-trainable entropy-based conf...
research
10/04/2019

Modeling Confidence in Sequence-to-Sequence Models

Recently, significant improvements have been achieved in various natural...
research
10/22/2020

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition

For various speech-related tasks, confidence scores from a speech recogn...
research
03/11/2021

Learning Word-Level Confidence For Subword End-to-End ASR

We study the problem of word-level confidence estimation in subword-base...
research
04/09/2019

Performance Monitoring for End-to-End Speech Recognition

Measuring performance of an automatic speech recognition (ASR) system wi...
research
02/18/2020

Uncertainty in Structured Prediction

Uncertainty estimation is important for ensuring safety and robustness o...
