
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

12/15/2017
by Benoit Jacob, et al.

The rising popularity of intelligent mobile devices and the daunting computational cost of deep-learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware. We also co-design a training procedure that preserves end-to-end model accuracy after quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated on ImageNet classification and COCO detection on popular CPUs.
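
The abstract does not spell out the mapping between real values and integers, so the following is only a minimal sketch, assuming the widely used affine form r ≈ S(q − Z) with a floating-point scale S and an integer zero point Z. All function names (`choose_qparams`, `quantize`, `dequantize`, `int_dot`) and the 8-bit range are illustrative assumptions, not taken from the paper.

```python
# Sketch of affine quantization, ASSUMING the mapping r ~= scale * (q - zero_point).
# Names and the uint8 range [0, 255] are hypothetical choices for illustration.

def choose_qparams(rmin, rmax, qmin=0, qmax=255):
    """Pick a scale and zero point so that real 0.0 maps exactly to an integer."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)  # the range must contain 0.0
    if rmax == rmin:
        return 1.0, qmin  # degenerate range: any positive scale works
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # clamp into the integer range
    return scale, zero_point

def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to the nearest representable integer, saturating at the ends."""
    q = int(round(r / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Recover the real value that the integer q represents."""
    return scale * (q - zero_point)

def int_dot(qa, za, qb, zb):
    """Dot product of two quantized vectors using integer arithmetic only.

    The real-valued result is approximately scale_a * scale_b * int_dot(...).
    """
    return sum((x - za) * (y - zb) for x, y in zip(qa, qb))
```

Under these assumptions, the inner loop of a dot product touches only integers; the two scales are applied once to the accumulated sum, which is where an integer-only implementation would substitute a fixed-point multiplier.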


05/21/2018

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

Deep learning as a means to inferencing has proliferated thanks to its v...
02/15/2019

AutoQB: AutoML for Network Quantization and Binarization on Mobile Devices

In this paper, we propose a hierarchical deep reinforcement learning (DR...
06/21/2020

Efficient Integer-Arithmetic-Only Convolutional Neural Networks

Integer-arithmetic-only networks have been demonstrated effective to red...
04/03/2019

Progressive Stochastic Binarization of Deep Networks

A plethora of recent research has focused on improving the memory footpr...
04/21/2020

A Data and Compute Efficient Design for Limited-Resources Deep Learning

Thanks to their improved data efficiency, equivariant neural networks ha...
02/08/2022

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

For binary neural networks (BNNs) to become the mainstream on-device com...
08/21/2021

Integer-arithmetic-only Certified Robustness for Quantized Neural Networks

Adversarial data examples have drawn significant attention from the mach...