VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

08/03/2022
by   Jiatong Shi, et al.
0

Beam search, which is the dominant ASR decoding algorithm for end-to-end models, generates tree-structured hypotheses. However, recent studies have shown that decoding with hypothesis merging can achieve a more efficient search with comparable or better performance. But, the full context in recurrent networks is not compatible with hypothesis merging. We propose to use vector-quantized long short-term memory units (VQ-LSTM) in the prediction network of RNN transducers. By training the discrete representation jointly with the ASR network, hypotheses can be actively merged for lattice generation. Our experiments on the Switchboard corpus show that the proposed VQ RNN transducers improve ASR performance over transducers with regular prediction networks while also producing denser lattices with a very low oracle word error rate (WER) for the same beam size. Additional language model rescoring experiments also demonstrate the effectiveness of the proposed lattice generation scheme.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/12/2020

Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging

End-to-end models that condition the output label sequence on all previo...
research
02/10/2020

Accelerating RNN Transducer Inference via One-Step Constrained Beam Search

We propose a one-step constrained (OSC) beam search to accelerate recurr...
research
11/05/2019

RNN-T For Latency Controlled ASR With Improved Beam Search

Neural transducer-based systems such as RNN Transducers (RNN-T) for auto...
research
07/27/2020

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

In this work, we propose a novel and efficient minimum word error rate (...
research
11/06/2018

Discriminative training of RNNLMs with the average word error criterion

In automatic speech recognition (ASR), recurrent neural language models ...
research
02/28/2023

A Token-Wise Beam Search Algorithm for RNN-T

Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithm...
research
11/01/2021

Sequence Transduction with Graph-based Supervision

The recurrent neural network transducer (RNN-T) objective plays a major ...

Please sign up or login with your details

Forgot password? Click here to reset