PokeBNN: A Binary Pursuit of Lightweight Accuracy

by Yichi Zhang, et al.

Top-1 ImageNet optimization promotes enormous networks that may be impractical in inference settings. Binary neural networks (BNNs) have the potential to significantly lower the compute intensity, but existing models suffer from low quality. To overcome this deficiency, we propose PokeConv, a binary convolution block that improves the quality of BNNs through techniques such as adding multiple residual paths and tuning the activation function. We apply it to ResNet-50 and optimize ResNet's initial convolutional layer, which is hard to binarize. We name the resulting network family PokeBNN. These techniques are chosen to yield favorable improvements in both top-1 accuracy and the network's cost. To enable joint optimization of cost and accuracy, we define arithmetic computation effort (ACE), a hardware- and energy-inspired cost metric for quantized and binarized networks. We also identify a need to optimize an under-explored hyper-parameter controlling the binarization gradient approximation. We establish a new, strong state-of-the-art (SOTA) on top-1 accuracy together with the commonly used CPU64 cost, ACE cost, and network size metrics. ReActNet-Adam, the previous SOTA in BNNs, achieved 70.5% top-1 accuracy at 7.9 ACE. A small variant of PokeBNN achieves 70.5% top-1 accuracy with a more than 3x reduction in cost; a larger PokeBNN achieves 75.6% top-1 accuracy, an improvement of more than 5%. The PokeBNN implementation in JAX/Flax and reproduction instructions are open sourced.
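
The abstract does not say which gradient approximation is used for binarization, nor which hyper-parameter controls it. As an illustration only, the sketch below shows one common approach in JAX: a clipped straight-through estimator, where the forward pass maps activations to {-1, +1} and the backward pass lets gradients through only where the input lies within a bound. The function name `binarize` and the parameter `bound` are hypothetical and are not taken from the PokeBNN code.

```python
from functools import partial

import jax
import jax.numpy as jnp


# Hypothetical helper: `bound` stands in for a tunable hyper-parameter that
# controls the gradient approximation; it is an assumption, not the paper's API.
@partial(jax.custom_vjp, nondiff_argnums=(1,))
def binarize(x, bound):
    # Forward pass: map every element to -1 or +1 (zero goes to +1).
    return jnp.where(x >= 0, 1.0, -1.0).astype(x.dtype)


def _binarize_fwd(x, bound):
    # Save the input so the backward pass can build the clipping mask.
    return binarize(x, bound), x


def _binarize_bwd(bound, x, g):
    # Clipped straight-through estimator: pass the incoming gradient where
    # |x| <= bound and zero it elsewhere.
    mask = (jnp.abs(x) <= bound).astype(g.dtype)
    return (g * mask,)


binarize.defvjp(_binarize_fwd, _binarize_bwd)

# Usage: compare gradients under a narrow and a wide bound.
x = jnp.array([-2.0, -0.5, 0.5, 2.0])
grad_narrow = jax.grad(lambda v: jnp.sum(binarize(v, 1.0)))(x)
grad_wide = jax.grad(lambda v: jnp.sum(binarize(v, 3.0)))(x)
```

With `bound = 1.0` only the two inner elements receive a gradient, while `bound = 3.0` passes gradients to all four. This shows how such a bound changes which activations are updated during training.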




