GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition

10/22/2019
by Hugo Braun, et al.

We present an optimized weighted finite-state transducer (WFST) decoder capable of online streaming and offline batch processing of audio using Graphics Processing Units (GPUs). The decoder is efficient in memory utilization and input/output bandwidth, and uses a novel Viterbi implementation designed to maximize parallelism. Memory savings enable the decoder to process larger graphs than previously possible while simultaneously supporting larger numbers of consecutive streams. GPU preprocessing of lattice segments enables intermediate lattice results to be returned to the requestor during streaming inference. Collectively, the proposed improvements achieve up to a 240x speedup over single-core CPU decoding, and up to 40x faster decoding than the current state-of-the-art GPU decoder, while returning equivalent results. This architecture also makes it practical to deploy production-grade models on hardware ranging from large data center servers to low-power edge devices.
