Toward Computation and Memory Efficient Neural Network Acoustic Models with Binary Weights and Activations

06/28/2017
by   Liang Lu, et al.

Neural network acoustic models have significantly advanced state-of-the-art speech recognition over the past few years. However, they are usually computationally expensive due to the large number of matrix-vector multiplications and nonlinear activation operations. Neural network models also require significant amounts of memory for inference because of their large size. For these two reasons, it is challenging to deploy neural network based speech recognizers on resource-constrained platforms such as embedded devices. This paper investigates the use of binary weights and activations for computation- and memory-efficient neural network acoustic models. Compared to real-valued weight matrices, binary weights require far fewer bits for storage, thereby reducing the memory footprint. Furthermore, with binary weights or activations, the matrix-vector multiplications are turned into addition and subtraction operations, which are computationally much faster and more energy efficient on hardware platforms. In this paper, we study the application of binary weights and activations to neural network acoustic modeling, reporting encouraging results on the WSJ and AMI corpora.
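The core idea can be illustrated with a minimal NumPy sketch (this is not the paper's implementation; the layer shape, the sign-based binarization rule, and the tie-break at zero are illustrative assumptions):

```python
import numpy as np

def binarize(x):
    """Deterministic binarization to {-1, +1} via the sign function.
    Mapping zero to +1 is a convention chosen here, not prescribed by the paper."""
    return np.where(x >= 0, 1.0, -1.0)

# Hypothetical shapes for one hidden layer of an acoustic model.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 440))  # real-valued weights (kept at full precision during training)
x = rng.standard_normal(440)         # input activation vector

# With binary weights, each weight needs 1 bit instead of 32, and the
# matrix-vector product reduces to adding or subtracting entries of x.
Wb = binarize(W)
y_binary_w = Wb @ x                  # every "multiply" is just +x_j or -x_j

# With binary activations as well, the dot product over {-1, +1} vectors
# can be implemented on hardware with XNOR and popcount on packed bits;
# the float arithmetic below only mirrors that result.
xb = binarize(x)
y_binary_wx = Wb @ xb                # integer-valued sums in [-440, 440]
```

In a real deployment, the binarized weights would be packed into bit vectors rather than stored as floats, which is where the roughly 32x memory saving over single-precision weights comes from.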


Related research:

- Learning Discrete Weights and Activations Using the Local Reparameterization Trick (07/04/2023). In computer vision and machine learning, a crucial challenge is to lower...
- A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network (07/11/2018). In this paper, a time delay neural network (TDNN) based acoustic model i...
- Binary-decomposed DCNN for accelerating computation and compressing model without retraining (09/14/2017). Recent trends show recognition accuracy increasing even more profoundly...
- Knowledge Distillation for Small-footprint Highway Networks (08/02/2016). Deep learning has significantly advanced state-of-the-art of speech reco...
- AlgebraNets (06/12/2020). Neural networks have historically been built layerwise from the set of f...
- FullPack: Full Vector Utilization for Sub-Byte Quantized Inference on General Purpose CPUs (11/13/2022). Although prior art has demonstrated negligible accuracy drop in sub-byte...
- BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models (06/29/2023). With the increasing popularity and the increasing size of vision transfo...
