Training Quantized Network with Auxiliary Gradient Module

03/27/2019
by Bohan Zhuang, et al.

In this paper, we seek to tackle two challenges in training low-precision networks: 1) the notorious difficulty of propagating gradients through a low-precision network due to the non-differentiable quantization function; 2) the requirement of a full-precision realization of skip connections in residual-type network architectures. During training, we introduce an auxiliary gradient module which mimics the effect of skip connections to assist the optimization. We then expand the original low-precision network with the full-precision auxiliary gradient module to form a mixed-precision residual network, and optimize it jointly with the low-precision model using weight sharing and separate batch normalization. This strategy ensures that gradients back-propagate more easily, thus alleviating a major difficulty in training low-precision networks. Moreover, we find that when a low-precision plain network is trained with our method, it can achieve performance similar to its counterpart with residual skip connections; i.e., the plain network without floating-point skip connections is just as effective to deploy at inference time. To further promote gradient flow during backpropagation, we employ a stochastic structured precision strategy that stochastically samples and quantizes sub-networks while keeping the other parts full-precision. We evaluate the proposed method on the image classification task over various quantization approaches and show consistent performance improvements.
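Two of the ideas in the abstract can be sketched concretely: a straight-through estimator (STE) that lets gradients pass through the non-differentiable quantization function, and a block whose convolution weights are shared between a low-precision path and a full-precision auxiliary path, each with its own batch normalization. The following is a minimal PyTorch sketch of those two ideas only, not the authors' implementation; the names `SignSTE` and `SharedWeightBlock`, the 1-bit sign quantizer, and the hard-tanh gradient clipping are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of two ideas from the abstract:
# 1) a straight-through estimator so gradients flow through quantization,
# 2) a conv layer shared between a quantized path and a full-precision
#    auxiliary path, each path with its own batch normalization.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SignSTE(torch.autograd.Function):
    """Binarize in the forward pass; pass the gradient straight through in backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Zero the straight-through gradient outside [-1, 1] (hard-tanh STE).
        return grad_output * (x.abs() <= 1).float()


class SharedWeightBlock(nn.Module):
    """One convolution evaluated twice, quantized and full-precision, with shared weights.

    Separate BatchNorm layers are kept per path, since the quantized and
    full-precision activations have different statistics.
    """

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.05)
        self.bn_lp = nn.BatchNorm2d(out_ch)  # BN for the low-precision path
        self.bn_fp = nn.BatchNorm2d(out_ch)  # BN for the full-precision auxiliary path

    def forward(self, x_lp, x_fp):
        # Low-precision path: quantize the shared weights with the STE.
        w_q = SignSTE.apply(self.weight)
        y_lp = F.relu(self.bn_lp(F.conv2d(x_lp, w_q, padding=1)))
        # Full-precision auxiliary path: same weight tensor, no quantization.
        y_fp = F.relu(self.bn_fp(F.conv2d(x_fp, self.weight, padding=1)))
        return y_lp, y_fp


if __name__ == "__main__":
    block = SharedWeightBlock(3, 8)
    x = torch.randn(2, 3, 32, 32)
    y_lp, y_fp = block(x, x)
    # Both paths contribute gradients to the single shared weight tensor,
    # so the full-precision path helps drive updates of the quantized one.
    (y_lp.mean() + y_fp.mean()).backward()
    print(block.weight.grad.shape)  # torch.Size([8, 3, 3, 3])
```

In this sketch, joint optimization simply sums the losses of the two paths so that the full-precision auxiliary path supplies an easier gradient signal to the shared weights; the paper's actual auxiliary gradient module and stochastic structured precision sampling are not reproduced here.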

