Binary-decomposed DCNN for accelerating computation and compressing model without retraining

09/14/2017
by   Ryuji Kamiya, et al.
Recent trends show recognition accuracy improving ever more markedly. The inference process of a Deep Convolutional Neural Network (DCNN) has a large number of parameters, requires a large amount of computation, and can be very slow; the many parameters also demand large amounts of memory. The result is increasingly long computation times and large model sizes. To deploy DCNNs on mobile and other low-performance devices, model sizes must be compressed and computation must be accelerated. To that end, this paper proposes Binary-decomposed DCNN, which resolves these issues without retraining. Our method replaces real-valued inner-product computations in existing network models with binary inner-product computations, accelerating inference and decreasing model size without the need for retraining. Binary computations can be performed at high speed using logical operators such as XOR and AND, together with bit counting. In tests using AlexNet on the ImageNet classification task, speed increased by a factor of 1.79, models were compressed by approximately 80%, and the increase in error rate was limited to 1.20%. With VGG-16, speed increased by a factor of 2.07, model sizes decreased by 81%, and the increase in error rate was limited to 2.16%.
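To make the mechanism concrete, here is a minimal sketch of the two ingredients the abstract describes: greedily decomposing a real-valued weight vector into a small weighted sum of {-1, +1} basis vectors, and evaluating the resulting binary inner products with XOR and bit counting. The decomposition rule shown (sign of the residual for each basis vector, a least-squares scale for its coefficient) is a common greedy scheme and all names are illustrative; this is not necessarily the authors' exact algorithm, and the activation is binarized directly here only for simplicity.

```python
import numpy as np

def binary_decompose(w, k):
    """Greedy approximation w ~= sum_j c[j] * B[j], with B[j] in {-1,+1}^n.

    Assumption: sign-of-residual bases with least-squares coefficients,
    a standard greedy scheme, not necessarily the paper's exact method.
    """
    r = w.astype(np.float64).copy()
    n = r.size
    B = np.empty((k, n), dtype=np.int8)
    c = np.empty(k)
    for j in range(k):
        B[j] = np.where(r >= 0, 1, -1)   # binary basis: sign of the residual
        c[j] = float(B[j] @ r) / n       # least-squares scale (B[j].B[j] = n)
        r -= c[j] * B[j]                 # peel off this term and continue
    return c, B

def pack_bits(vec):
    """Pack a {-1,+1} vector into an int bit mask (+1 -> bit 1, -1 -> bit 0)."""
    mask = 0
    for i, v in enumerate(vec):
        if v > 0:
            mask |= 1 << i
    return mask

def binary_dot(ma, mb, n):
    """<a, b> for packed {-1,+1} vectors: matching bits contribute +1 and
    mismatching bits -1, so <a, b> = n - 2 * popcount(a XOR b)."""
    return n - 2 * bin(ma ^ mb).count("1")

# Usage sketch: approximate one real-valued dot product with binary ops only.
rng = np.random.default_rng(0)
n, k = 256, 4
w = rng.standard_normal(n)                        # e.g. one filter's weights
x = np.where(rng.standard_normal(n) >= 0, 1, -1)  # binarized activation

c, B = binary_decompose(w, k)
mx = pack_bits(x)
approx = sum(c[j] * binary_dot(pack_bits(B[j]), mx, n) for j in range(k))
print(f"binary approx: {approx:.3f}   exact: {float(w @ x):.3f}")
```

With 64-bit words, each `binary_dot` step costs one XOR and one popcount per word instead of 64 multiply-accumulates, which is where the reported speedup comes from; increasing k trades speed and compression for a closer approximation of the original weights.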
