Quantization-Guided Training for Compact TinyML Models

by Sedigh Ghamari, et al.

We propose a Quantization-Guided Training (QGT) method that guides DNN training towards optimized low-bit-precision targets and reaches extreme compression levels below 8-bit precision. Unlike standard quantization-aware training (QAT) approaches, QGT uses customized regularization to push weight values towards a distribution that maximizes accuracy while reducing quantization errors. One of the main benefits of this approach is the ability to identify compression bottlenecks. We validate QGT using state-of-the-art model architectures on vision datasets. We also demonstrate the effectiveness of QGT with an 81KB tiny model for person detection down to 2-bit precision (representing a 17.7x size reduction), while maintaining an accuracy drop of only 3
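
The abstract does not spell out the form of the customized regularization, so the following is only an illustrative sketch of the general idea: a penalty added to the task loss that shrinks as weights move closer to the levels of a low-bit grid. The symmetric uniform grid, the squared-distance penalty, and the names `quantize`, `quantization_penalty`, `total_loss`, `w_max`, and `lam` are all assumptions made for illustration and are not taken from the paper.

```python
def quantize(w, bits, w_max):
    """Snap w to the nearest level of a symmetric uniform grid on [-w_max, w_max].

    Assumed grid: 2**bits evenly spaced levels, end points included.
    """
    steps = 2 ** bits - 1                  # intervals between the 2**bits levels
    step = 2.0 * w_max / steps             # spacing between adjacent levels
    clipped = max(-w_max, min(w_max, w))   # clamp into the representable range
    index = round((clipped + w_max) / step)
    return -w_max + index * step


def quantization_penalty(weights, bits, w_max):
    """Mean squared distance between each weight and its nearest grid level."""
    return sum((w - quantize(w, bits, w_max)) ** 2 for w in weights) / len(weights)


def total_loss(task_loss, weights, bits, w_max, lam):
    """Task loss plus the quantization penalty, weighted by lam (hypothetical)."""
    return task_loss + lam * quantization_penalty(weights, bits, w_max)


if __name__ == "__main__":
    weights = [-0.9, -0.3, 0.05, 0.4, 1.0]
    # 2-bit grid on [-1, 1] has levels -1, -1/3, 1/3, 1.
    print([round(quantize(w, 2, 1.0), 4) for w in weights])
    print(total_loss(0.5, weights, bits=2, w_max=1.0, lam=0.1))
```

In a sketch like this, weights that already sit on grid levels contribute nothing to the penalty, so the extra term only acts on weights that would incur quantization error at the chosen bit width.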


