Recurrence of Optimum for Training Weight and Activation Quantized Networks

12/10/2020
by Ziang Long, et al.

Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms. However, training models with low-precision weights and activations is a demanding optimization task: it calls for minimizing a stage-wise loss function subject to a discrete set constraint. While numerous training methods have been proposed, existing studies on full quantization of DNNs are mostly empirical. From a theoretical point of view, we study practical techniques for overcoming the combinatorial nature of network quantization. Specifically, we investigate a simple yet powerful projected-gradient-like algorithm for quantizing two-linear-layer networks. The algorithm maintains float weights and repeatedly takes one descent step on them along the negative of a heuristic fake gradient of the loss (the so-called coarse gradient) evaluated at the quantized weights. For the first time, we prove that, under mild conditions, the sequence of quantized weights recurrently visits the global optimum of the discrete minimization problem for training the fully quantized network. We also present numerical evidence of this recurrence phenomenon in the weight evolution of quantized deep networks during training.
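
The update itself is easy to state. Below is a minimal sketch, not the authors' code: float weights are kept alongside their quantized projection, a heuristic gradient of the loss is evaluated at the quantized weights, and the step is applied back to the float weights before re-quantizing. For simplicity the sketch quantizes only the weights of a toy two-linear-layer regression model with a ReLU activation, so the heuristic step reduces to a straight-through-style update through the weight quantizer; the sign-based quantizer, dimensions, data, and learning rate are all illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch (illustrative assumptions throughout) of a projected,
# coarse-gradient-style update: float weights are nudged by a heuristic
# gradient evaluated at their quantized projection, then re-quantized.
import numpy as np

rng = np.random.default_rng(0)

def quantize(w):
    """Project float weights onto a 1-bit set {-a, +a} with a layer-wise scale a."""
    a = np.mean(np.abs(w)) + 1e-12
    return a * np.sign(w)

def relu(x):
    return np.maximum(x, 0.0)

# Toy two-linear-layer regression problem (all dimensions are arbitrary choices).
d, h, n = 8, 16, 256
X = rng.normal(size=(n, d))
W1_true, W2_true = rng.normal(size=(d, h)), rng.normal(size=(h, 1))
y = relu(X @ quantize(W1_true)) @ quantize(W2_true)

# Float weights that the algorithm actually updates.
W1, W2 = rng.normal(size=(d, h)), rng.normal(size=(h, 1))
lr = 1e-2

for step in range(2001):
    Q1, Q2 = quantize(W1), quantize(W2)   # quantized weights used in the forward pass
    hpre = X @ Q1
    hact = relu(hpre)
    pred = hact @ Q2
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)

    # Heuristic ("coarse") step: backpropagate as if the quantized weights were
    # the float weights, then apply the resulting step to the float weights.
    gQ2 = hact.T @ err / n
    gh = (err @ Q2.T) * (hpre > 0)
    gQ1 = X.T @ gh / n
    W1 -= lr * gQ1
    W2 -= lr * gQ2

    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss:.4f}")
```

In this sketch the quantized pairs (Q1, Q2) produced at each iteration play the role of the sequence of quantized weights whose recurrence the paper analyzes.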

Related research

08/15/2018  Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks
Quantized deep neural networks (QDNNs) are attractive due to their much ...

01/19/2018  BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights
We propose BinaryRelax, a simple two-phase algorithm, for training deep ...

11/23/2020  Learning Quantized Neural Nets by Coarse Gradient Method for Non-linear Classification
Quantized or low-bit neural networks are attractive due to their inferen...

02/05/2023  Quantized Distributed Training of Large Models with Convergence Guarantees
Communication-reduction techniques are a popular way to improve scalabil...

11/07/2022  AskewSGD: An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
In this paper, we develop a new algorithm, Annealed Skewed SGD - AskewSG...

12/06/2022  QEBVerif: Quantization Error Bound Verification of Neural Networks
While deep neural networks (DNNs) have demonstrated impressive performan...

06/07/2016  Deep neural networks are robust to weight binarization and other non-linear distortions
Recent results show that deep neural networks achieve excellent performa...
