Trace norm regularization and faster inference for embedded speech recognition RNNs

10/25/2017
by Markus Kliegl, et al.

We propose and evaluate new techniques for compressing and speeding up dense matrix multiplications as found in the fully connected and recurrent layers of neural networks for embedded large vocabulary continuous speech recognition (LVCSR). For compression, we introduce and study a trace norm regularization technique for training low-rank factored versions of matrix multiplications. Compared to standard low-rank training, we show that our method more consistently leads to good trade-offs between accuracy and number of parameters, and can be used to speed up training of large models. For speedup, we enable faster inference on ARM processors through new open-sourced kernels optimized for small batch sizes, resulting in 3x to 7x speedups over the widely used gemmlowp library. Beyond LVCSR, we expect our techniques and kernels to be more generally applicable to embedded neural networks with large fully connected or recurrent layers.
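To make the compression idea concrete, below is a minimal PyTorch-style sketch of a factored layer with a trace (nuclear) norm penalty. It relies on the standard variational bound ||UV||_* <= 0.5(||U||_F^2 + ||V||_F^2), with equality at the optimal factorization, so penalizing the Frobenius norms of the factors regularizes the trace norm of the product. The class and method names (FactoredLinear, trace_norm_penalty, inner_dim) are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class FactoredLinear(nn.Module):
    """Linear layer parameterized as W = U @ V so that penalizing
    0.5 * (||U||_F^2 + ||V||_F^2) acts as trace norm regularization
    on W (it upper bounds ||W||_* and matches it at the optimum)."""

    def __init__(self, in_features, out_features, inner_dim=None):
        super().__init__()
        inner_dim = inner_dim or min(in_features, out_features)
        self.U = nn.Parameter(torch.randn(out_features, inner_dim) * 0.01)
        self.V = nn.Parameter(torch.randn(inner_dim, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Equivalent to x @ (U @ V).T + bias, but keeps the factors explicit.
        return x @ self.V.t() @ self.U.t() + self.bias

    def trace_norm_penalty(self):
        # Frobenius-norm surrogate for the trace norm of U @ V.
        return 0.5 * (self.U.pow(2).sum() + self.V.pow(2).sum())
```

In use, one would add lambda * layer.trace_norm_penalty() to the task loss during training; the penalty encourages the product UV to have rapidly decaying singular values, after which small singular values can be truncated to obtain a compact low-rank layer for inference.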

Related research

07/13/2021
Data-Driven Low-Rank Neural Network Compression
Despite many modern applications of Deep Neural Networks (DNNs), the lar...

05/29/2019
Rethinking Full Connectivity in Recurrent Neural Networks
Recurrent neural networks (RNNs) are omnipresent in sequence modeling ta...

05/04/2021
Performance Evaluation of Deep Convolutional Maxout Neural Network in Speech Recognition
In this paper, various structures and methods of Deep Artificial Neural ...

04/07/2015
Efficient SDP Inference for Fully-connected CRFs Based on Low-rank Decomposition
Conditional Random Fields (CRF) have been widely used in a variety of co...

10/30/2019
Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer
High performing deep neural networks come at the cost of computational c...

03/15/2023
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models
Continued improvements in machine learning techniques offer exciting new...

06/01/2018
Training LSTM Networks with Resistive Cross-Point Devices
In our previous work we have shown that resistive cross point devices, s...
