FIT: A Metric for Model Sensitivity

10/16/2022
by Ben Zandonati, et al.

Model compression is vital to the deployment of deep learning on edge devices. Low precision representations, achieved via quantization of weights and activations, can reduce inference time and memory requirements. However, quantifying and predicting the response of a model to the changes associated with this procedure remains challenging. This response is non-linear and heterogeneous throughout the network. Understanding which groups of parameters and activations are more sensitive to quantization than others is a critical stage in maximizing efficiency. For this purpose, we propose FIT. Motivated by an information geometric perspective, FIT combines the Fisher information with a model of quantization. We find that FIT can estimate the final performance of a network without retraining. FIT effectively fuses contributions from both parameter and activation quantization into a single metric. Additionally, FIT is fast to compute when compared to existing methods, demonstrating favourable convergence properties. These properties are validated experimentally across hundreds of quantization configurations, with a focus on layer-wise mixed-precision quantization.
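To make the idea concrete: a FIT-style sensitivity score can be read as a Fisher-weighted quantization perturbation, roughly FIT ≈ Σ_i F_ii (Δθ_i)², where F_ii is the diagonal of the (empirical) Fisher information and Δθ_i is the perturbation that quantization applies to parameter i. The sketch below is a minimal illustration of that idea, not the authors' implementation: it approximates the Fisher diagonal with squared per-parameter gradients, models quantization as symmetric uniform rounding, and covers only parameter (not activation) quantization. The names `fit_score`, `quant_perturbation`, the per-layer `bit_config` dictionary, and the cross-entropy loss are all assumptions made for illustration.

```python
# Hypothetical sketch of a FIT-style sensitivity estimate (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


def quant_perturbation(w: torch.Tensor, bits: int) -> torch.Tensor:
    # Symmetric uniform quantization model: round onto a 2^bits-level grid
    # and return the perturbation delta_theta = Q(w) - w.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax  # guard against all-zero tensors
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w_q - w


def fit_score(model: nn.Module, batches, bit_config: dict, default_bits: int = 8) -> float:
    # Diagonal empirical Fisher approximation: accumulate squared gradients
    # of the loss over a few batches (an assumption; the paper derives FIT
    # from an information-geometric view of the Fisher information).
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in batches:
        model.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2

    # FIT-style score: sum_i F_ii * (delta_theta_i)^2 over all parameters,
    # with per-layer bit-widths taken from bit_config.
    score = 0.0
    for n, p in model.named_parameters():
        delta = quant_perturbation(p.detach(), bit_config.get(n, default_bits))
        score += (fisher[n] * delta ** 2).sum().item()
    return score


# Example usage on a toy classifier: lower scores suggest a bit configuration
# the network is less sensitive to.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
batches = [(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,)))]
print(fit_score(model, batches, bit_config={"1.weight": 4}))
```

Because the score needs only a handful of gradient evaluations and no retraining, comparing it across candidate layer-wise bit assignments is cheap, which is the practical appeal of a metric like FIT for mixed-precision search.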


Related research

07/04/2022 · BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation
Neural network quantization aims to transform high-precision weights and...

08/05/2019 · GDRQ: Group-based Distribution Reshaping for Quantization
Low-bit quantization is challenging to maintain high performance with li...

08/25/2022 · Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization
Post-training quantization (PTQ) attracts increasing attention due to it...

09/16/2020 · MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework
With the tremendous success of deep learning, there exists imminent need...

12/20/2022 · CSMPQ: Class Separability Based Mixed-Precision Quantization
Mixed-precision quantization has received increasing attention for its c...

10/03/2018 · Relaxed Quantization for Discretized Neural Networks
Neural network quantization has become an important research area due to...

11/13/2019 · DupNet: Towards Very Tiny Quantized CNN with Improved Accuracy for Face Detection
Deploying deep learning based face detectors on edge devices is a challe...
