Progressive Stochastic Binarization of Deep Networks

04/03/2019
by David Hartmann, et al.

A plethora of recent research has focused on reducing the memory footprint and improving the inference speed of deep networks by lowering the complexity of (i) numerical representations (for example, by deterministic or stochastic quantization) and (ii) arithmetic operations (for example, by binarization of weights). We propose a stochastic binarization scheme for deep networks that allows for efficient inference on hardware by restricting itself to additions of small integers and fixed shifts. Unlike previous approaches, the underlying randomized approximation is progressive, permitting adaptive control of the accuracy of each operation at run time. In a low-precision setting, we match the accuracy of previous binarized approaches. Our representation is unbiased: it approaches continuous computation with increasing sample size. In a high-precision regime, the computational costs are competitive with previous quantization schemes. Progressive stochastic binarization also permits localized, dynamic accuracy control within a single network, thereby providing a new tool for adaptively focusing computational attention. We evaluate our method on networks of various architectures pretrained on ImageNet. With representational costs comparable to previous schemes, we obtain accuracies close to the original floating-point implementation. This includes pruned networks, with the known exception of certain types of separable convolutions. By focusing computational attention using progressive sampling, we reduce inference costs on ImageNet further by a factor of up to 33.
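To make the core idea concrete, below is a minimal NumPy sketch of unbiased stochastic binarization with progressive sampling. It is an illustration under assumptions, not the paper's exact scheme: the per-tensor power-of-two scale s (chosen so that rescaling is a fixed shift), the sampling probability, and the function names stochastic_binarize and progressive_estimate are hypothetical choices consistent with the abstract's description.

```python
import numpy as np

def stochastic_binarize(w, rng):
    """Draw one unbiased binary sample of the weight tensor w.

    Each weight is represented as s * b with b in {-1, +1}, where s is a
    per-tensor power-of-two scale (so multiplying by s is a fixed shift in
    hardware) and P(b = +1) = (1 + w / s) / 2.  Then E[b] = w / s, so the
    sample s * b is an unbiased estimate of w.
    """
    # Smallest power of two covering the weight range; guarantees w / s in [-1, 1].
    s = 2.0 ** np.ceil(np.log2(np.max(np.abs(w)) + 1e-12))
    p_plus = np.clip((1.0 + w / s) / 2.0, 0.0, 1.0)
    b = np.where(rng.random(w.shape) < p_plus, 1.0, -1.0)
    return s * b

def progressive_estimate(w, n_samples, rng):
    """Average n_samples independent binary draws.

    The estimate stays unbiased for any n_samples, and its variance shrinks
    like 1 / n_samples, so per-layer accuracy can be tuned at run time simply
    by choosing how many samples to draw.
    """
    return np.mean([stochastic_binarize(w, rng) for _ in range(n_samples)], axis=0)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(4, 4))
for k in (1, 8, 64):
    err = np.abs(progressive_estimate(w, k, rng) - w).mean()
    print(f"{k:3d} samples: mean abs error {err:.4f}")
```

Because each draw is unbiased, averaging more draws reduces the approximation error roughly like 1/K in variance; this is the mechanism that lets accuracy be traded for computation on a per-operation basis at run time.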

