Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition

11/12/2018
by   Hiroshi Seki, et al.
0

Attention-based encoder decoder network uses a left-to-right beam search algorithm in the inference step. The current beam search expands hypotheses and traverses the expanded hypotheses at the next time step. This traversal is implemented using a for-loop program in general, and it leads to speed down of the recognition process. In this paper, we propose a parallelism technique for beam search, which accelerates the search process by vectorizing multiple hypotheses to eliminate the for-loop program. We also propose a technique to batch multiple speech utterances for off-line recognition use, which reduces the for-loop program with regard to the traverse of multiple utterances. This extension is not trivial during beam search unlike during training due to several pruning and thresholding techniques for efficient decoding. In addition, our method can combine scores of external modules, RNNLM and CTC, in a batch as shallow fusion. We achieved 3.7 x speedup compared with the original beam search algorithm by vectoring hypotheses, and achieved 10.5 x speedup by further changing processing unit to GPU.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/10/2020

Accelerating RNN Transducer Inference via One-Step Constrained Beam Search

We propose a one-step constrained (OSC) beam search to accelerate recurr...
research
07/24/2023

Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition

Although frame-based models, such as CTC and transducers, have an affini...
research
10/18/2021

Efficient Sequence Training of Attention Models using Approximative Recombination

Sequence discriminative training is a great tool to improve the performa...
research
02/06/2017

Beam Search Strategies for Neural Machine Translation

The basic concept in Neural Machine Translation (NMT) is to train a larg...
research
02/28/2023

A Token-Wise Beam Search Algorithm for RNN-T

Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithm...
research
09/09/2019

A Quantum Search Decoder for Natural Language Processing

Probabilistic language models, e.g. those based on an LSTM, often face t...
research
03/21/2022

Enhancing Speech Recognition Decoding via Layer Aggregation

Recently proposed speech recognition systems are designed to predict usi...

Please sign up or login with your details

Forgot password? Click here to reset