A GPU-based WFST Decoder with Exact Lattice Generation

04/09/2018
by Zhehuai Chen, et al.

We describe initial work on an extension of the Kaldi toolkit that supports weighted finite-state transducer (WFST) decoding on Graphics Processing Units (GPUs). We implement token recombination as an atomic GPU operation in order to fully parallelize the Viterbi beam search, and propose a dynamic load balancing strategy that schedules token passing more efficiently among GPU threads. We also redesign the exact lattice generation and lattice pruning algorithms to make better use of the GPU. Experiments on the Switchboard corpus show that the proposed method achieves identical 1-best results and lattice quality in recognition and confidence measure tasks, while running 3 to 15 times faster than the single-process Kaldi decoder. These results are reported on several different GPU architectures. Additionally, we obtain a 46-fold speedup by combining sequence parallelism with the multi-process service (MPS) on the GPU.
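The abstract does not say how token recombination is turned into a single atomic operation. One common approach, assumed here purely for illustration, is to pack a token's cost and its back-pointer into one 64-bit word so that a single atomic minimum on that word keeps the best token for each state. The sketch below emulates that packing sequentially in Python; it does not run on a GPU, and the names `float_to_ordered_u32`, `pack_token`, `unpack_arc`, `recombine`, and `EMPTY` are hypothetical, not taken from Kaldi or from the paper.

```python
import struct

# Sentinel for "no token yet": larger than any packed finite-cost token.
EMPTY = (1 << 64) - 1


def float_to_ordered_u32(x: float) -> int:
    """Map a float32 to a uint32 whose unsigned order matches the float order."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if bits & 0x80000000:
        # Negative float: flip every bit so larger magnitudes sort lower.
        return ~bits & 0xFFFFFFFF
    # Non-negative float: set the sign bit so it sorts above all negatives.
    return bits | 0x80000000


def pack_token(cost: float, arc_id: int) -> int:
    """Pack the cost (high 32 bits) and a back-pointer (low 32 bits)."""
    return (float_to_ordered_u32(cost) << 32) | (arc_id & 0xFFFFFFFF)


def unpack_arc(packed: int) -> int:
    """Recover the back-pointer stored in the low 32 bits."""
    return packed & 0xFFFFFFFF


def recombine(num_states: int, candidates):
    """Keep the lowest-cost token per destination state.

    Each min() below stands in for one 64-bit atomic minimum that a GPU
    thread would issue on the per-state slot (an assumption of this sketch).
    """
    best = [EMPTY] * num_states
    for state, cost, arc_id in candidates:
        best[state] = min(best[state], pack_token(cost, arc_id))
    return best


# Five candidate tokens (state, cost, arc_id) competing for three states.
candidates = [(0, 3.5, 10), (1, 2.0, 11), (0, 1.25, 12), (0, 7.0, 13), (1, -0.5, 14)]
best = recombine(3, candidates)
```

Because the cost occupies the high bits, comparing the packed words compares costs first, so the winning word carries the back-pointer of the cheapest token without any lock. In this emulation the result is independent of the order in which the candidates are processed, which is the property a parallel implementation would rely on.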


research
10/22/2019

GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition

We present an optimized weighted finite-state transducer (WFST) decoder ...
research
11/25/2021

LET-Decoder: A WFST-based Lazy-evaluation Token-group Decoder with Exact Lattice Generation

We propose a novel lazy-evaluation token-group decoding algorithm with o...
research
12/02/2003

Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards

The latest Graphics Processing Units (GPUs) are reported to reach up to ...
research
01/11/2017

Decoding with Finite-State Transducers on GPUs

Weighted finite automata and transducers (including hidden Markov models...
research
06/23/2023

Implementing contextual biasing in GPU decoder for online ASR

GPU decoding significantly accelerates the output of ASR predictions. Wh...
research
04/25/2023

LAST: Scalable Lattice-Based Speech Modelling in JAX

We introduce LAST, a LAttice-based Speech Transducer library in JAX. Wit...
research
10/22/2021

GPU-Accelerated Forward-Backward algorithm with Application to Lattice-Free MMI

We propose to express the forward-backward algorithm in terms of operati...
