Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

07/27/2020
by Jinxi Guo, et al.

In this work, we propose a novel and efficient minimum word error rate (MWER) training method for RNN-Transducer (RNN-T). Previous work on this topic performs on-the-fly, limited-size beam-search decoding and generates alignment scores for the expected edit-distance computation. In our proposed method, we instead re-calculate and sum the scores of all possible alignments for each hypothesis in the N-best lists. The hypothesis probability scores and back-propagated gradients are calculated efficiently using the forward-backward algorithm. Moreover, the proposed method allows us to decouple the decoding and training processes, so we can perform offline parallel decoding and MWER training for each subset iteratively. Experimental results show that the proposed semi-on-the-fly method is 6 times faster than the on-the-fly method and yields a similar WER improvement (3.6% [...]). The proposed MWER training can also effectively reduce the high deletion errors (9.2% WER reduction) introduced by RNN-T models when an end-of-sentence (EOS) token is added for the endpointer. Further improvement can be achieved if we use a proposed RNN-T rescoring method to re-rank hypotheses and use an external RNN-LM to perform additional rescoring. The best system achieves a 5% [...] real far-field recordings and an 11.6% [...]
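The two ingredients named in the abstract, summing over all alignments of a hypothesis and taking an expectation of word errors over an N-best list, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the input layout (`log_blank[t][u]`, `log_label[t][u]`), and the choice to renormalize posteriors over the N-best list are assumptions made for illustration; the forward recursion is the standard one for an RNN-T lattice.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def rnnt_forward_logprob(log_blank, log_label):
    """Log-probability of one hypothesis, summed over all alignments.

    Assumed (hypothetical) input layout:
      log_blank[t][u]: log-prob of blank at frame t after u labels, T x (U+1)
      log_label[t][u]: log-prob of the (u+1)-th label at frame t, T x U
    """
    T = len(log_blank)
    U = len(log_blank[0]) - 1
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t == 0 and u == 0:
                continue
            terms = []
            if t > 0:  # arrive by emitting blank at the previous frame
                terms.append(alpha[t - 1][u] + log_blank[t - 1][u])
            if u > 0:  # arrive by emitting a label at the same frame
                terms.append(alpha[t][u - 1] + log_label[t][u - 1])
            alpha[t][u] = logsumexp(terms)
    # Every alignment ends with a blank emitted at the last frame.
    return alpha[T - 1][U] + log_blank[T - 1][U]


def mwer_loss(hyp_logprobs, word_errors):
    """Expected number of word errors over an N-best list.

    Assumption: hypothesis posteriors are renormalized over the list.
    """
    log_z = logsumexp(hyp_logprobs)
    posteriors = [math.exp(lp - log_z) for lp in hyp_logprobs]
    return sum(p * e for p, e in zip(posteriors, word_errors))


# Toy lattice with T=2 frames and a one-label hypothesis (U=1).
half, one = math.log(0.5), 0.0
log_blank = [[half, one], [half, one]]
log_label = [[half], [half]]
score = rnnt_forward_logprob(log_blank, log_label)

# Toy 2-best list with hypothetical scores and word-error counts.
loss = mwer_loss([math.log(0.6), math.log(0.2)], [0, 2])
```

In a real training setup the same quantities would be computed with an automatic-differentiation framework so that gradients flow back through the hypothesis scores; the sketch only shows the forward computation.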

research
11/28/2019

Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition

In this work, we propose minimum Bayes risk (MBR) training of RNN-Transd...
research
10/23/2020

On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer

Hybrid Autoregressive Transducer (HAT) is a recently proposed end-to-end...
research
02/28/2023

A Token-Wise Beam Search Algorithm for RNN-T

Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithm...
research
04/27/2021

On Addressing Practical Challenges for RNN-Transducer

In this paper, several works are proposed to address practical challenge...
research
08/03/2022

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Beam search, which is the dominant ASR decoding algorithm for end-to-end...
research
10/29/2022

Accelerating RNN-T Training and Inference Using CTC guidance

We propose a novel method to accelerate training and inference process o...
research
04/15/2022

Streaming Align-Refine for Non-autoregressive Deliberation

We propose a streaming non-autoregressive (non-AR) decoding algorithm to...
