Data-free mixed-precision quantization using novel sensitivity metric

03/18/2021
by Donghyun Lee, et al.

Post-training quantization is a representative technique for compressing neural networks, making them smaller and more efficient for deployment on edge devices. In practice, however, the user dataset is often inaccessible, which makes it difficult to guarantee the quality of the quantized network. In addition, existing approaches typically apply a single uniform bit-width across the network, causing significant accuracy degradation at extremely low bit-widths. To exploit multiple bit-widths, a sensitivity metric plays a key role in balancing accuracy and compression. In this paper, we propose a novel sensitivity metric that considers both the effect of quantization error on the task loss and the interaction with other layers. Moreover, we develop labeled data generation methods that do not depend on any specific operation of the neural network. Our experiments show that the proposed metric better represents quantization sensitivity, and that the generated data are more suitable for mixed-precision quantization.
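The exact sensitivity metric and data-generation procedure are given in the full text; as a rough illustration of the workflow the abstract describes, the sketch below ranks layers by how much quantizing their weights perturbs the task loss and then greedily assigns per-layer bit-widths. This is not the authors' method: the uniform weight quantizer, the loss-increase threshold, the helper names (quantize_weight, layer_sensitivity, allocate_bits), and the use of random synthetic inputs in place of generated labeled data are all assumptions of this sketch.

```python
# Minimal sketch of loss-based per-layer sensitivity measurement and greedy
# mixed-precision bit allocation. NOT the paper's metric; it only illustrates
# ranking layers by how much quantizing them increases the task loss.
# Random synthetic inputs stand in for the paper's generated labeled data.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def quantize_weight(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale


def layer_sensitivity(model: nn.Module, layer_name: str, bits: int,
                      inputs: torch.Tensor, labels: torch.Tensor) -> float:
    """Increase in task loss when only `layer_name` is quantized to `bits`."""
    model.eval()
    with torch.no_grad():
        base_loss = F.cross_entropy(model(inputs), labels).item()
        perturbed = copy.deepcopy(model)
        w = dict(perturbed.named_parameters())[layer_name + ".weight"]
        w.copy_(quantize_weight(w, bits))
        quant_loss = F.cross_entropy(perturbed(inputs), labels).item()
    return quant_loss - base_loss


def allocate_bits(model: nn.Module, inputs, labels,
                  candidate_bits=(2, 4, 8), threshold=0.05):
    """Greedily pick, per Conv/Linear layer, the lowest bit-width whose
    loss increase stays below `threshold` (an assumed heuristic)."""
    plan = {}
    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            for b in candidate_bits:
                if layer_sensitivity(model, name, b, inputs, labels) < threshold:
                    plan[name] = b
                    break
            else:
                plan[name] = max(candidate_bits)
    return plan


if __name__ == "__main__":
    # Toy model and synthetic "labeled" data purely for demonstration.
    model = nn.Sequential(nn.Flatten(),
                          nn.Linear(32 * 32 * 3, 256), nn.ReLU(),
                          nn.Linear(256, 10))
    x = torch.randn(64, 3, 32, 32)
    y = torch.randint(0, 10, (64,))
    print(allocate_bits(model, x, y))
```

The sketch measures each layer in isolation; the metric proposed in the paper additionally accounts for interactions between layers, which this independent, one-layer-at-a-time measurement does not capture.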


Related research

03/06/2018
Deep Neural Network Compression with Single and Multiple Level Quantization
Network quantization is an effective solution to compress deep neural ne...

10/31/2022
Block-Wise Dynamic-Precision Neural Network Training Acceleration via Online Quantization Sensitivity Analytics
Data quantization is an effective method to accelerate neural network tr...

04/19/2023
Post-Training Quantization for Object Detection
Efficient inference for object detection networks is a major challenge o...

07/11/2023
Mixed-Precision Quantization with Cross-Layer Dependencies
Quantization is commonly used to compress and accelerate deep neural net...

12/21/2022
Automatic Network Adaptation for Ultra-Low Uniform-Precision Quantization
Uniform-precision neural network quantization has gained popularity sinc...

08/22/2020
One Weight Bitwidth to Rule Them All
Weight quantization for deep ConvNets has shown promising results for ap...

09/16/2021
OMPQ: Orthogonal Mixed Precision Quantization
To bridge the ever increasing gap between deep neural networks' complexi...
