Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization

by Jonathan Macoskey, et al.

We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference-time latency on speech recognition tasks. The architecture enables a dynamic pivot in its runtime compute pathway, using keyword spotting to select which component of the network to execute for a given audio frame. To accomplish this, we leverage a recurrent cell we call the Bifocal LSTM (BFLSTM), which we detail in the paper. The architecture is compatible with other optimization strategies such as quantization, sparsification, and time-reduction layers, making it especially applicable to deployed, real-time speech recognition settings. We present the architecture and report comparative experimental results on voice-assistant speech recognition tasks. Specifically, we show that our proposed Bifocal RNN-T can improve inference cost by 29.1%.
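The abstract describes per-frame routing: a keyword-spotting signal decides which branch of the recurrent network processes each audio frame. The following is a minimal sketch of that idea, not the authors' implementation; the class names (`BifocalCell`, `LSTMCell`), the state projections between the two branches, the `kws_scores` input, and all dimensions are illustrative assumptions.

```python
# Hedged sketch of a "bifocal" recurrent cell: a cheap small LSTM handles
# most frames, and a keyword-spotting score routes likely in-domain frames
# to a larger, more expensive LSTM. All details here are assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Plain LSTM cell; `n_hidden` controls its per-frame compute cost."""
    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # Stacked weights for the input, forget, cell, and output gates.
        self.W = rng.standard_normal((4 * n_hidden, n_in + n_hidden)) * 0.1
        self.b = np.zeros(4 * n_hidden)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        return h_new, c_new

class BifocalCell:
    """Runs the small cell by default; switches to the large cell on
    frames the keyword spotter flags (an assumed routing rule)."""
    def __init__(self, n_in, n_small, n_large, seed=0):
        rng = np.random.default_rng(seed)
        self.small = LSTMCell(n_in, n_small, rng)
        self.large = LSTMCell(n_in, n_large, rng)
        # Projections carry state between the two hidden sizes (assumed).
        self.up = rng.standard_normal((n_large, n_small)) * 0.1
        self.down = rng.standard_normal((n_small, n_large)) * 0.1

    def forward(self, frames, kws_scores, threshold=0.5):
        h = np.zeros(self.small.n_hidden)
        c = np.zeros(self.small.n_hidden)
        path = []  # records which branch ran on each frame
        for x, score in zip(frames, kws_scores):
            if score >= threshold:  # keyword likely: take the large path
                h_l, c_l = self.large.step(x, self.up @ h, self.up @ c)
                h, c = self.down @ h_l, self.down @ c_l
                path.append("large")
            else:                   # otherwise: take the cheap path
                h, c = self.small.step(x, h, c)
                path.append("small")
        return h, path

# Example: four frames, two of which the keyword spotter flags.
cell = BifocalCell(n_in=8, n_small=4, n_large=16)
frames = [np.full(8, 0.1) for _ in range(4)]
h, path = cell.forward(frames, kws_scores=[0.1, 0.9, 0.2, 0.8])
```

The savings come from the `else` branch: frames without a spotted keyword cost only the small cell's matrix multiply, which is how a dynamic compute pathway can reduce average inference cost while keeping a high-capacity branch available.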




