Learning Word-Level Confidence For Subword End-to-End ASR

03/11/2021
by David Qiu, et al.

We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior work has proposed training auxiliary confidence models for ASR systems, these models do not extend naturally to systems that use word-pieces (WP) as their vocabulary. In particular, training a confidence model requires ground-truth WP correctness labels, but because a word can be tokenized into WPs in more than one way, the generated labels are inaccurate. This paper proposes and studies two confidence models of increasing complexity to solve this problem. The final model uses self-attention to learn word-level confidence directly, without needing subword tokenization, and exploits full-context features from multiple hypotheses to improve confidence accuracy. Experiments on Voice Search and long-tail test sets show substantial improvements on standard metrics (e.g., NCE, AUC, RMSE). The proposed confidence module also enables a model selection approach that combines an on-device E2E model with a hybrid model on the server to address the rare word recognition problem for the E2E model.
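The labeling problem described in the abstract can be illustrated with a small sketch. It is not the paper's method: the word-piece strings, the `_` word-boundary marker, and the helper names are hypothetical, and `difflib.SequenceMatcher` is used as a simple stand-in for an edit-distance alignment. The point is only that a correctly recognized word can receive "incorrect" labels at the WP level when the hypothesis and the reference tokenize it differently, while labels computed at the word level are unaffected.

```python
from difflib import SequenceMatcher


def correctness_labels(hyp, ref):
    """Label each hypothesis token 1 if it aligns to an identical reference token, else 0.

    SequenceMatcher stands in for an edit-distance alignment (an assumption of this sketch).
    """
    labels = [0] * len(hyp)
    matcher = SequenceMatcher(a=hyp, b=ref, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            labels[block.a + offset] = 1
    return labels


def words_from_wordpieces(pieces, boundary="_"):
    """Join word-pieces into words; a piece starting with `boundary` opens a new word."""
    words = []
    for piece in pieces:
        if piece.startswith(boundary) or not words:
            words.append(piece[len(boundary):] if piece.startswith(boundary) else piece)
        else:
            words[-1] += piece
    return words


# Hypothetical tokenizations of the same two words, "good morning".
ref_wp = ["_good", "_morning"]      # reference tokenization
hyp_wp = ["_good", "_mor", "ning"]  # the recognizer emitted a different split

# WP-level labels: both pieces of the correct word "morning" are marked wrong.
wp_labels = correctness_labels(hyp_wp, ref_wp)

# Word-level labels: the same hypothesis is fully correct.
word_labels = correctness_labels(
    words_from_wordpieces(hyp_wp), words_from_wordpieces(ref_wp)
)

print(wp_labels, word_labels)
```

In this toy case the WP-level labels penalize a word that was recognized correctly, which is the kind of label noise that motivates estimating confidence at the word level.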

research
04/26/2021

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction

Confidence scores are very useful for downstream applications of automat...
research
10/07/2021

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

As end-to-end automatic speech recognition (ASR) models reach promising ...
research
06/09/2023

Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition

End-to-end (E2E) systems have shown comparable performance to hybrid sys...
research
01/14/2021

An evaluation of word-level confidence estimation for end-to-end automatic speech recognition

Quantifying the confidence (or conversely the uncertainty) of a predicti...
research
12/16/2022

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

This paper presents a class of new fast non-trainable entropy-based conf...
research
05/18/2023

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

Estimating confidence scores for recognition results is a classic task i...
research
06/27/2023

Confidence-based Ensembles of End-to-End Speech Recognition Models

The number of end-to-end speech recognition models grows every year. The...
