Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy

07/20/2018
by Zhezhi He, et al.

Deep convolutional neural networks have achieved great success in many artificial-intelligence applications. However, their enormous model size and massive computation cost have become the main obstacles to deploying such powerful algorithms on low-power, resource-limited embedded systems. As a countermeasure, in this work we propose statistical weight scaling and residual expansion methods that reduce the bit-width of all network weight parameters to ternary values (i.e., -1, 0, +1), with the objective of greatly reducing model size and computation cost while limiting the accuracy degradation caused by model compression. With about a 16x model compression rate, our ternarized ResNet-32/44/56 outperform their full-precision counterparts by 0.12%, 0.24% and 0.18% respectively on the CIFAR-10 dataset. We also test our ternarization method with AlexNet and ResNet-18 on the ImageNet dataset; both achieve the best top-1 accuracy among recent similar works at the same 16x compression rate. If we further incorporate our residual expansion method, our ternarized ResNet-18 even improves top-5 accuracy by 0.61% over the full-precision counterpart while degrading top-1 accuracy by only 0.42%, and it outperforms the recent ABC-Net by 1.03% in top-1 accuracy, with an around 1.25x higher compression rate and more than 6x computation reduction due to the weight sparsity.
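The two ideas in the abstract can be sketched in code. This is a minimal illustrative sketch, not the paper's exact algorithm: it uses a common threshold rule from ternary-weight-network practice (0.7 times the mean absolute weight) and a statistical scaling factor alpha equal to the mean magnitude of the surviving weights; the `residual_expand` helper is a hypothetical illustration of the residual expansion idea, where each extra ternary term approximates the residual left by the previous terms.

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Ternarize a weight tensor to {-alpha, 0, +alpha}.

    Sketch of threshold-based ternarization: weights with magnitude below
    a threshold become 0; the rest become +/-1, scaled by a statistically
    determined factor alpha (the mean magnitude of the surviving weights).
    The 0.7 * E|w| threshold is an assumption borrowed from common
    ternary-weight practice, not necessarily the paper's exact rule.
    """
    threshold = delta_factor * np.mean(np.abs(w))
    mask = np.abs(w) > threshold       # positions kept as +/-1
    ternary = np.sign(w) * mask        # values in {-1, 0, +1}
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * ternary, alpha

def residual_expand(w, order=2):
    """Illustrative residual expansion: approximate w as a sum of `order`
    ternary terms, each one ternarizing the residual of the previous fit."""
    terms, residual = [], w.copy()
    for _ in range(order):
        w_t, alpha = ternarize(residual)
        terms.append((w_t, alpha))
        residual = residual - w_t
    return terms

# Example: ternarize a random conv-like weight tensor.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_t, alpha = ternarize(w)
print("scale:", alpha, "sparsity:", float(np.mean(w_t == 0)))
```

The zeroed-out weights are the source of the sparsity the abstract credits for the >6x computation reduction, and each additional residual term lowers the approximation error at the cost of a lower compression rate.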

Related research:

- 10/02/2018 · Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network using Truncated Gaussian Approximation: "In the past years, Deep convolution neural network has achieved great su..."
- 11/30/2021 · PokeBNN: A Binary Pursuit of Lightweight Accuracy: "Top-1 ImageNet optimization promotes enormous networks that may be impra..."
- 05/18/2020 · Cross-filter compression for CNN inference acceleration: "Convolution neural network demonstrates great capability for multiple ta..."
- 01/15/2019 · URNet: User-Resizable Residual Networks with Conditional Gating Module: "Convolutional Neural Networks are widely used to process spatial scenes,..."
- 07/15/2017 · Ternary Residual Networks: "Sub-8-bit representation of DNNs incur some discernible loss of accuracy..."
- 12/19/2021 · Elastic-Link for Binarized Neural Network: "Recent work has shown that Binarized Neural Networks (BNNs) are able to ..."
- 10/01/2019 · NESTA: Hamming Weight Compression-Based Neural Proc. Engine: "In this paper, we present NESTA, a specialized Neural engine that signif..."
