Memory-efficient Speech Recognition on Smart Devices

02/23/2021
by Ganesh Venkatesh, et al.

Recurrent transducer models have emerged as a promising solution for speech recognition on current and next-generation smart devices. These models provide competitive accuracy within a reasonable memory footprint, alleviating the memory capacity constraints of such devices. However, they access parameters in off-chip memory at every input time step, which adversely affects device battery life and limits their usability on low-power devices. We address the transducer model's memory access concerns by optimizing its architecture and designing novel recurrent cells. We demonstrate that i) the model's energy cost is dominated by accessing model weights in off-chip memory, ii) the transducer model architecture is pivotal in determining the number of off-chip memory accesses, and model size alone is not a good proxy for them, and iii) our model optimizations and novel recurrent cell reduce off-chip memory accesses by 4.5x and model size by 2x with minimal accuracy impact.
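As a rough illustration of observations (i) and (ii) above, here is a minimal back-of-envelope sketch. It is not from the paper: the per-operation energy constants, the layer dimensions, and the helper names (`lstm_weight_count`, `per_step_energy_pj`) are all illustrative assumptions, loosely in line with published 45 nm-class estimates (e.g., Horowitz, ISSCC 2014), which put an off-chip DRAM read at roughly two orders of magnitude more energy than an on-chip multiply-accumulate.

```python
# Back-of-envelope comparison for a recurrent transducer layer:
# energy spent streaming weights from off-chip DRAM vs. energy spent
# on the multiply-accumulates those weights feed.
# All constants below are assumptions for illustration, not measurements.

DRAM_PJ_PER_BYTE = 160.0   # assumed energy to read one byte from off-chip DRAM
MAC_PJ = 1.0               # assumed energy for one on-chip multiply-accumulate
BYTES_PER_WEIGHT = 1       # assumes 8-bit quantized weights


def lstm_weight_count(input_dim: int, hidden_dim: int) -> int:
    """Weights in a standard LSTM cell: 4 gates, each with an input
    matrix, a recurrent matrix, and a bias vector."""
    return 4 * (input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim)


def per_step_energy_pj(num_weights: int) -> tuple[float, float]:
    """Energy per input time step if every weight is streamed from DRAM.
    Each weight participates in roughly one MAC per step."""
    fetch = num_weights * BYTES_PER_WEIGHT * DRAM_PJ_PER_BYTE
    compute = num_weights * MAC_PJ
    return fetch, compute


if __name__ == "__main__":
    w = lstm_weight_count(input_dim=512, hidden_dim=1024)
    fetch, compute = per_step_energy_pj(w)
    print(f"LSTM weights:              {w / 1e6:.1f} M")
    print(f"DRAM fetch energy / step:  {fetch / 1e6:.1f} uJ")
    print(f"Compute energy / step:     {compute / 1e6:.1f} uJ")
    print(f"Fetch / compute ratio:     {fetch / compute:.0f}x")
```

Under these assumed numbers, fetching the weights costs over 100x more energy than the arithmetic itself, matching observation (i). And because recurrent weights are re-fetched at every input frame, two models of identical size can generate very different off-chip traffic depending on how often each weight is reused on-chip; this is the intuition behind observation (ii), that model size alone is not a good proxy for memory accesses.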

