Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning

02/03/2018
by Yixing Li, et al.

Convolutional neural networks (CNNs) are widely used for vision-based tasks. Due to their high computational complexity and memory storage requirements, it is hard to deploy a full-precision CNN directly on embedded devices; hardware-friendly designs are needed for resource-limited and energy-constrained embedded platforms. Emerging solutions have been adopted for neural network compression, e.g., binary/ternary weight networks, pruned networks, and quantized networks. Among them, the Binarized Neural Network (BNN) is believed to be the most hardware-friendly framework due to its small network size and low computational complexity. No existing work has further shrunk the size of a BNN. In this work, we explore the redundancy in BNNs and build a compact BNN (CBNN) based on bit-level sensitivity analysis and bit-level data pruning. The input data is converted to a high-dimensional bit-sliced format. In the post-training stage, we analyze the impact of each bit slice on accuracy. By pruning the redundant input bit slices and shrinking the network size accordingly, we are able to build a more compact BNN. Our results show that we can further scale down the network size of the BNN by up to 3.9x with no more than 1% accuracy drop, compared with the baseline BNN and its full-precision counterpart, respectively.
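To make the pipeline concrete, below is a minimal NumPy sketch of how the bit-slicing and post-training sensitivity analysis could be set up, assuming 8-bit input images. The function names (bit_slice, slice_sensitivity, prune_slices) and the ablation-based sensitivity metric are illustrative assumptions, not the authors' released code.

    import numpy as np

    def bit_slice(images, num_bits=8):
        """Convert uint8 images of shape (N, H, W, C) into a bit-sliced
        tensor of shape (N, H, W, C * num_bits): one binary plane per
        bit position, least-significant bit first."""
        slices = [((images >> b) & 1).astype(np.float32) for b in range(num_bits)]
        return np.concatenate(slices, axis=-1)

    def slice_sensitivity(evaluate, images, labels, num_bits=8):
        """Post-training analysis: measure the accuracy drop caused by
        zeroing out each input bit plane. `evaluate(x, y)` is any function
        returning the accuracy of the trained BNN on bit-sliced input."""
        sliced = bit_slice(images, num_bits)
        baseline = evaluate(sliced, labels)
        channels_per_bit = images.shape[-1]
        drops = []
        for b in range(num_bits):
            ablated = sliced.copy()
            lo = b * channels_per_bit
            ablated[..., lo:lo + channels_per_bit] = 0.0  # remove bit plane b
            drops.append(baseline - evaluate(ablated, labels))
        return np.array(drops)  # drops[b]: accuracy lost without bit plane b

    def prune_slices(drops, tol=0.01):
        """Keep only the bit planes whose removal costs more than `tol`
        accuracy; the thinner input lets the first layer, and in turn the
        whole network, shrink."""
        return [b for b, d in enumerate(drops) if d > tol]

In a setup like this, one would fine-tune the shrunken BNN on only the retained bit planes; intuitively, the low-order planes carry mostly noise-like detail, so they would be the natural pruning candidates.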
