Binary Neural Networks as a general-purpose compute paradigm for on-device computer vision

02/08/2022
by Guhong Nie, et al.

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a speed-vs-accuracy tradeoff superior to that of 8-bit quantization and establish a similar degree of general applicability across vision tasks. To this end, we propose a BNN framework comprising 1) a minimalistic inference scheme for hardware-friendliness, 2) an over-parameterized training scheme for high accuracy, and 3) a simple procedure for adapting to different vision tasks. The resulting framework overtakes 8-bit quantization in the speed-vs-accuracy tradeoff for classification, detection, segmentation, super-resolution and matching: our BNNs not only retain the accuracy levels of their 8-bit baselines but also deliver 1.3-2.4× higher FPS on mobile CPUs. Similar conclusions hold for prototypical systolic-array-based AI accelerators, where our BNNs promise 2.8-7× fewer execution cycles than 8-bit quantization and 2.1-2.7× fewer cycles than alternative BNN designs. These results suggest that the time for large-scale BNN adoption may be upon us.
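To make the source of the speedup concrete: once weights and activations are constrained to {-1, +1}, a dot product reduces to an XOR of packed bit vectors followed by a population count, replacing per-element multiply-accumulates. Below is a minimal NumPy sketch of this standard BNN identity (dot = n - 2·popcount(a XOR w)); the helper names binarize, pack_bits and binary_dot are illustrative, not taken from the paper.

    import numpy as np

    def binarize(x):
        # Map real values to {-1, +1} via sign; zeros map to +1.
        return np.where(x >= 0, 1, -1).astype(np.int8)

    def pack_bits(b):
        # Encode {-1, +1} as bits {0, 1} and pack 8 per byte.
        return np.packbits(b == 1)

    def binary_dot(a_bits, w_bits, n):
        # Equal bits contribute +1, differing bits -1, so:
        # dot = (n - mismatches) - mismatches = n - 2 * popcount(a XOR w)
        mismatches = np.unpackbits(np.bitwise_xor(a_bits, w_bits))[:n].sum()
        return n - 2 * int(mismatches)

    # Check against the full-precision dot product of the binarized vectors.
    rng = np.random.default_rng(0)
    a = binarize(rng.standard_normal(256))
    w = binarize(rng.standard_normal(256))
    assert binary_dot(pack_bits(a), pack_bits(w), 256) == int(a.astype(int) @ w.astype(int))

On the training side, binarization is non-differentiable, so BNNs are commonly trained with a straight-through estimator (STE) that binarizes in the forward pass and passes clipped gradients through in the backward pass. The PyTorch sketch below shows only this generic baseline technique; it is not the paper's over-parameterized training scheme.

    import torch

    class SignSTE(torch.autograd.Function):
        # Forward: hard sign binarization to {-1, +1}.
        # Backward: identity gradient, clipped to |x| <= 1 ("straight-through").
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

        @staticmethod
        def backward(ctx, grad_out):
            (x,) = ctx.saved_tensors
            return grad_out * (x.abs() <= 1).to(grad_out.dtype)

    # Usage: binarize weights on the fly inside a layer's forward pass.
    w = torch.randn(8, 8, requires_grad=True)
    w_bin = SignSTE.apply(w)
    w_bin.sum().backward()  # gradients flow back to w through the STE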
