Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

12/15/2017
by Benoit Jacob, et al.

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy after quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
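
The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of what integer-only inference can look like. It assumes an affine mapping r ≈ S(q − Z) from 8-bit integers q to real values r, with a real scale S and an integer zero point Z. All function names are hypothetical and chosen for illustration. The matrix product itself uses only integer operations with 32-bit accumulation. The final rescaling by the two scales is done in floating point here purely to check the result, whereas an integer-only pipeline would replace it with a fixed-point multiplier.

```python
import numpy as np

def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick scale S and zero point Z so that [rmin, rmax] maps onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that real zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        return 1.0, 0  # degenerate all-zero tensor
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Map real values to uint8 codes: q = clip(round(r / S) + Z)."""
    q = np.round(r / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.uint8)

def dequantize(q, scale, zero_point):
    """Map uint8 codes back to real values: r = S * (q - Z)."""
    return scale * (q.astype(np.int32) - zero_point)

def quantized_matmul(qa, za, qb, zb):
    """Integer-only core of a matrix product, accumulated in int32."""
    return (qa.astype(np.int32) - za) @ (qb.astype(np.int32) - zb)

# Illustrative inputs (assumed values, not taken from the paper).
a = np.array([[-1.0, 0.5], [0.25, 1.0]])
b = np.array([[0.1, -0.7], [0.9, 0.3]])

sa, za = choose_params(a.min(), a.max())
sb, zb = choose_params(b.min(), b.max())
qa, qb = quantize(a, sa, za), quantize(b, sb, zb)

acc = quantized_matmul(qa, za, qb, zb)  # int32 accumulators
approx = sa * sb * acc                  # rescale only to compare with a @ b
```

In this sketch `approx` tracks `a @ b` to within a few hundredths, with the error coming from rounding each input to one of 256 levels.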

research
05/21/2018

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

Deep learning as a means to inferencing has proliferated thanks to its v...
research
02/15/2019

AutoQB: AutoML for Network Quantization and Binarization on Mobile Devices

In this paper, we propose a hierarchical deep reinforcement learning (DR...
research
06/21/2020

Efficient Integer-Arithmetic-Only Convolutional Neural Networks

Integer-arithmetic-only networks have been demonstrated effective to red...
research
04/03/2019

Progressive Stochastic Binarization of Deep Networks

A plethora of recent research has focused on improving the memory footpr...
research
04/21/2020

A Data and Compute Efficient Design for Limited-Resources Deep Learning

Thanks to their improved data efficiency, equivariant neural networks ha...
research
02/08/2022

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

For binary neural networks (BNNs) to become the mainstream on-device com...
research
08/21/2021

Integer-arithmetic-only Certified Robustness for Quantized Neural Networks

Adversarial data examples have drawn significant attention from the mach...
