Softmax Gradient Tampering: Decoupling the Backward Pass for Improved Fitting

11/24/2021
by Bishshoy Das, et al.

We introduce Softmax Gradient Tampering, a technique for modifying the gradients in the backward pass of neural networks in order to enhance their accuracy. Our approach transforms the predicted probability values using a power-based probability transformation and then recomputes the gradients in the backward pass. This modification results in a smoother gradient profile, which we demonstrate empirically and theoretically. We perform a grid search for the transform parameters on residual networks. We demonstrate that modifying the softmax gradients in ConvNets can increase training accuracy, improving the fit across the training data and making fuller use of the learning capacity of neural networks. Combined with regularization techniques such as label smoothing, we obtain better test metrics and lower generalization gaps. Softmax gradient tampering improves ResNet-50's test accuracy by 0.52% over the baseline on the ImageNet dataset. Our approach is generic and can be applied across a wide range of network architectures and datasets.
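From the abstract, the backward pass appears to replace the softmax probabilities with a power-normalized version before forming the usual cross-entropy gradient. Below is a minimal NumPy sketch of that idea; the parameter name `alpha` and the exact normalization are assumptions inferred from the abstract, not the paper's notation.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def tampered_softmax_grad(logits, target_onehot, alpha=0.5):
    """Gradient of softmax cross-entropy w.r.t. the logits, with the
    probabilities first remapped by a power transform.

    The untampered gradient is (p - y). Here p_i is replaced by
    p_i**alpha / sum_j p_j**alpha before the subtraction (the transform
    and alpha are assumptions based on the abstract's description).
    alpha < 1 flattens the distribution, spreading the gradient more
    evenly across classes; alpha = 1 recovers the standard gradient.
    """
    p = softmax(logits)
    p_alpha = p ** alpha                                  # element-wise power transform
    p_tilde = p_alpha / p_alpha.sum(axis=-1, keepdims=True)  # renormalize to a distribution
    return p_tilde - target_onehot

# Example: a fairly confident prediction with true class at index 0.
logits = np.array([[3.0, 1.0, -1.0]])
y = np.array([[1.0, 0.0, 0.0]])
g_std = tampered_softmax_grad(logits, y, alpha=1.0)  # ordinary softmax CE gradient
g_tam = tampered_softmax_grad(logits, y, alpha=0.5)  # tampered, flatter gradient profile
```

In a framework such as PyTorch, this would naturally live in a custom autograd function whose forward pass (and hence the loss value) is unchanged and whose backward pass returns the tampered gradient, so only the backward computation is decoupled, as the title suggests.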


