SPIQ: Data-Free Per-Channel Static Input Quantization

03/28/2022
by Edouard Yvinec, et al.

Computationally expensive neural networks are ubiquitous in computer vision, and solutions for efficient inference have drawn growing attention in the machine learning community. Examples of such solutions include quantization, i.e. converting the processed values (weights and inputs) from floating point to integers, e.g. int8 or int4. Concurrently, the rise of privacy concerns has motivated the study of less invasive acceleration methods, such as data-free quantization of pre-trained model weights and activations. Previous approaches either exploit statistical information to deduce scalar ranges and scaling factors for the activations in a static manner, or dynamically adapt this range on the fly for each input of each layer: the latter is generally more accurate, at the expense of significantly slower inference. In this work, we argue that static input quantization can reach the accuracy levels of dynamic methods by means of a per-channel input quantization scheme that allows one to more finely preserve cross-channel dynamics. We show through a thorough empirical evaluation on multiple computer vision problems (e.g. ImageNet classification, Pascal VOC object detection and CityScapes semantic segmentation) that the proposed method, dubbed SPIQ, achieves accuracies rivalling dynamic approaches with static-level inference speed, significantly outperforming state-of-the-art quantization methods on every benchmark.
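
To make the core idea concrete, below is a minimal NumPy sketch of per-channel static input quantization: one scale and zero-point per channel, computed offline so nothing is estimated at inference time. The batch-norm-based range heuristic (mean ± k standard deviations) is a common data-free illustration, not necessarily SPIQ's exact derivation, and all statistics here are hypothetical.

```python
import numpy as np

def ranges_from_batchnorm(bn_mean, bn_std, k=3.0):
    """Estimate per-channel activation ranges without any data.

    Illustrative data-free heuristic: assume roughly Gaussian
    per-channel activations and take mean +/- k std, with both
    statistics read off a preceding batch-norm layer.
    """
    return bn_mean - k * bn_std, bn_mean + k * bn_std

def per_channel_qparams(c_min, c_max, n_bits=8):
    """Derive one (scale, zero_point) pair per channel, offline."""
    qmax = 2 ** n_bits - 1
    scale = (c_max - c_min) / qmax
    zero_point = np.round(-c_min / scale)
    return scale, zero_point

def per_channel_quantize(x, scale, zero_point, n_bits=8):
    """Quantize an (N, C, H, W) tensor with per-channel parameters."""
    qmin, qmax = 0, 2 ** n_bits - 1
    s = scale.reshape(1, -1, 1, 1)       # broadcast over N, H, W
    z = zero_point.reshape(1, -1, 1, 1)
    return np.clip(np.round(x / s + z), qmin, qmax).astype(np.uint8)

def per_channel_dequantize(q, scale, zero_point):
    """Map integer codes back to approximate floating-point values."""
    s = scale.reshape(1, -1, 1, 1)
    z = zero_point.reshape(1, -1, 1, 1)
    return (q.astype(np.float32) - z) * s

# Hypothetical batch-norm statistics for a 3-channel feature map.
bn_mean = np.array([0.0, 1.5, -0.5], dtype=np.float32)
bn_std = np.array([1.0, 2.0, 0.25], dtype=np.float32)
c_min, c_max = ranges_from_batchnorm(bn_mean, bn_std)
scale, zero_point = per_channel_qparams(c_min, c_max)

x = np.random.default_rng(0).normal(size=(1, 3, 8, 8)).astype(np.float32)
q = per_channel_quantize(x, scale, zero_point)        # static int8 input
x_hat = per_channel_dequantize(q, scale, zero_point)  # reconstruction
```

Because the scales and zero-points are fixed ahead of time, inference pays no per-input range estimation cost, which is what lets a per-channel static scheme match dynamic accuracy at static speed.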
