Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

10/30/2018
by Qiujia Li, et al.

The standard approach to mitigating errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word. In the simplest case, these scores are word posterior probabilities, whilst more complex schemes utilise bi-directional recurrent neural network (BiRNN) models. A number of upstream and downstream applications, however, rely on confidence scores assigned not only to 1-best hypotheses but to all words found in confusion networks or lattices. These include, but are not limited to, speaker adaptation, semi-supervised training and information retrieval. Although word posteriors could be used as confidence scores in those applications, they are known to have reliability issues. To make improved confidence scores more generally available, this paper shows how BiRNNs can be extended from 1-best sequences to confusion network and lattice structures. Experiments are conducted using one of the Cambridge University submissions to the IARPA OpenKWS 2016 competition. The results show that confusion network and lattice-based BiRNNs can provide a significant improvement in confidence estimation.
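The abstract does not say how a recurrent model is carried from a 1-best sequence over to a lattice, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that arcs are visited in topological order, that the hidden states of all arcs arriving at a node are merged by a simple mean, and that each arc carries a one-dimensional feature (its word posterior). All names (`rnn_cell`, `mean_pool`, `lattice_pass`, `confidences`) and all parameter values are hypothetical.

```python
import math
from collections import defaultdict

def rnn_cell(x, h_prev, W, U, b):
    """Plain tanh RNN cell: h = tanh(W x + U h_prev + b)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(W[i], x))
            + sum(u * hj for u, hj in zip(U[i], h_prev))
            + b[i]
        )
        for i in range(len(b))
    ]

def mean_pool(states, size):
    """Merge the hidden states of several arcs (assumed rule: plain mean)."""
    if not states:
        return [0.0] * size  # no predecessor: start from a zero state
    return [sum(s[i] for s in states) / len(states) for i in range(size)]

def lattice_pass(arcs, W, U, b, reverse=False):
    """Run the cell over every arc of a lattice.

    arcs: list of (start_node, end_node, features) with start_node < end_node,
    so sorting by node id yields a valid topological order.
    """
    size = len(b)
    if reverse:
        order = sorted(range(len(arcs)), key=lambda k: -arcs[k][1])
    else:
        order = sorted(range(len(arcs)), key=lambda k: arcs[k][0])
    by_node = defaultdict(list)  # node -> hidden states of arcs that reached it
    hidden = [None] * len(arcs)
    for k in order:
        start, end, x = arcs[k]
        src, dst = (end, start) if reverse else (start, end)
        h = rnn_cell(x, mean_pool(by_node[src], size), W, U, b)
        hidden[k] = h
        by_node[dst].append(h)
    return hidden

def confidences(arcs, fwd_params, bwd_params, v, c):
    """Concatenate forward and backward states, then apply a sigmoid output."""
    hf = lattice_pass(arcs, *fwd_params)
    hb = lattice_pass(arcs, *bwd_params, reverse=True)
    scores = []
    for f, g in zip(hf, hb):
        z = sum(vi * hi for vi, hi in zip(v, f + g)) + c
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

# Toy lattice: two competing arcs between nodes 0 and 1, then one arc 1 -> 2.
# The single feature per arc stands in for a word posterior (assumption).
arcs = [(0, 1, [0.9]), (0, 1, [0.1]), (1, 2, [0.8])]
fwd = ([[0.5], [-0.3]], [[0.1, 0.2], [0.0, 0.1]], [0.0, 0.0])
bwd = ([[0.4], [0.2]], [[0.3, 0.0], [0.1, 0.1]], [0.0, 0.0])
scores = confidences(arcs, fwd, bwd, v=[1.0, -1.0, 0.5, 0.5], c=0.0)
print(scores)  # one confidence in (0, 1) per arc, not only for the 1-best path
```

With untrained toy weights the printed values carry no meaning; the point is only that every arc in the lattice receives a score that depends on both its past and its future context. On a linear chain of arcs the same code reduces to an ordinary BiRNN over a 1-best sequence.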

research
10/25/2019

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

Recently, there has been growth in providers of speech transcription ser...
research
10/30/2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks

The standard approach to assess reliability of automatic speech transcri...
research
07/22/2019

On Modeling ASR Word Confidence

We present a new method for computing ASR word confidences that effectiv...
research
08/18/2017

Future Word Contexts in Neural Network Language Models

Recently, bidirectional recurrent network language models (bi-RNNLMs) ha...
research
12/16/2022

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

This paper presents a class of new fast non-trainable entropy-based conf...
research
02/29/2020

Voice trigger detection from LVCSR hypothesis lattices using bidirectional lattice recurrent neural networks

We propose a method to reduce false voice triggers of a speech-enabled p...
research
04/03/2017

Neural Lattice-to-Sequence Models for Uncertain Inputs

The input to a neural sequence-to-sequence model is often determined by ...