Gradient-Based Post-Training Quantization: Challenging the Status Quo

08/15/2023
by   Edouard Yvinec, et al.

Quantization has become a crucial step for the efficient deployment of deep neural networks, in which floating-point operations are converted to simpler fixed-point operations. In its most naive form, it simply consists of a combination of scaling and rounding transformations, leading to either a limited compression rate or a significant accuracy drop. Recently, gradient-based post-training quantization (GPTQ) methods have emerged as a suitable trade-off between such simple methods and more powerful, yet expensive, Quantization-Aware Training (QAT) approaches, particularly when quantizing LLMs, where scalability of the quantization process is of paramount importance. GPTQ essentially consists of learning the rounding operation using a small calibration set. In this work, we challenge common choices in GPTQ methods. In particular, we show that the process is, to a certain extent, robust to a number of variables (weight selection, feature augmentation, choice of calibration set). More importantly, we derive a number of best practices for designing more efficient and scalable GPTQ methods, regarding both the problem formulation (loss, degrees of freedom, use of non-uniform quantization schemes) and the optimization process (choice of variable and optimizer). Lastly, we propose a novel importance-based mixed-precision technique. These guidelines lead to significant performance improvements on all tested state-of-the-art GPTQ methods and networks (e.g. +6.819 points on ViT for 4-bit quantization), paving the way for the design of scalable yet effective quantization methods.
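The minimal sketch below illustrates the two ingredients the abstract refers to: the naive scale-and-round quantizer, and a GPTQ-style learned rounding, in the spirit of adaptive rounding, optimized on a small calibration batch. This is an illustration under simplifying assumptions (per-tensor symmetric scaling, a single linear layer, a plain MSE reconstruction loss); the function names, shapes, and hyper-parameters are placeholders and do not reproduce the paper's actual formulation.

```python
import torch

def uniform_quantize(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Naive uniform quantization: scale, round, clamp, rescale (per-tensor, symmetric)."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = w.abs().max() / q_max                       # scaling factor
    w_int = torch.clamp(torch.round(w / scale), -q_max - 1, q_max)
    return w_int * scale                                # de-quantized weights

def learn_rounding(w: torch.Tensor, x_calib: torch.Tensor,
                   n_bits: int = 4, steps: int = 1000, lr: float = 1e-2) -> torch.Tensor:
    """GPTQ-style sketch: learn the up/down rounding of each weight on a small
    calibration batch by minimizing the layer output reconstruction error.
    Hyper-parameters and the zero initialization of `v` are placeholders."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = w.abs().max() / q_max
    w_floor = torch.floor(w / scale)
    # One continuous variable per weight, squashed to [0, 1] by a sigmoid;
    # it decides whether the weight is rounded down (0) or up (1).
    v = torch.zeros_like(w, requires_grad=True)
    opt = torch.optim.Adam([v], lr=lr)
    y_ref = x_calib @ w.t()                             # full-precision layer output
    for _ in range(steps):
        w_q = torch.clamp(w_floor + torch.sigmoid(v), -q_max - 1, q_max) * scale
        loss = ((x_calib @ w_q.t()) - y_ref).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Harden the soft rounding into an actual integer grid point.
    w_int = torch.clamp(w_floor + (torch.sigmoid(v) > 0.5).float(), -q_max - 1, q_max)
    return w_int * scale

# Example usage on a random linear layer and calibration batch:
w = torch.randn(64, 128)          # (out_features, in_features)
x = torch.randn(256, 128)         # small calibration set
w_naive = uniform_quantize(w)
w_learned = learn_rounding(w, x)
```

The contrast between the two functions is the point: the first rounds every weight to the nearest grid point independently, while the second lets a gradient-based optimizer pick, per weight, the rounding direction that best preserves the layer's output on the calibration data.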

research · 07/16/2019
Learning Multimodal Fixed-Point Weights using Gradient Descent
Due to their high computational complexity, deep neural networks are sti...

research · 08/10/2023
NUPES : Non-Uniform Post-Training Quantization via Power Exponent Search
Deep neural network (DNN) deployment has been confined to larger hardwar...

research · 12/05/2022
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
The post-training quantization (PTQ) challenge of bringing quantized neu...

research · 03/19/2019
Trained Uniform Quantization for Accurate and Efficient Neural Network Inference on Fixed-Point Hardware
We propose a method of training quantization clipping thresholds for uni...

research · 04/22/2020
Up or Down? Adaptive Rounding for Post-Training Quantization
When quantizing neural networks, assigning each floating-point weight to...

research · 11/23/2021
HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance
With the recent demand of deploying neural network models on mobile and ...

research · 09/01/2022
On Quantizing Implicit Neural Representations
The role of quantization within implicit/coordinate neural networks is s...
