Activated Gradients for Deep Neural Networks

07/09/2021
by Mei Liu et al.

Deep neural networks often suffer from poor performance or even training failure due to the ill-conditioned problem, the vanishing/exploding gradient problem, and the saddle point problem. In this paper, a novel method that applies a gradient activation function (GAF) to the gradient is proposed to handle these challenges. Intuitively, the GAF enlarges tiny gradients and restricts large ones. Theoretically, this paper gives the conditions that the GAF needs to satisfy and, on this basis, proves that the GAF alleviates the problems mentioned above. In addition, this paper proves that, under some assumptions, SGD with the GAF converges faster than SGD without it. Furthermore, experiments on CIFAR, ImageNet, and PASCAL Visual Object Classes (VOC) confirm the GAF's effectiveness. The experimental results also demonstrate that the proposed method can be adopted in various deep neural networks to improve their performance. The source code is publicly available at https://github.com/LongJin-lab/Activated-Gradients-for-Deep-Neural-Networks.
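As a rough illustration of the idea in the abstract, the sketch below transforms each parameter gradient with a tanh-shaped GAF after backpropagation and before the optimizer step. The particular function gaf_tanh and its alpha/beta values are assumptions for illustration only; the paper derives conditions an admissible GAF must satisfy rather than fixing a single form.

import torch

def gaf_tanh(grad, alpha=2.0, beta=1.0):
    # Hypothetical GAF: a scaled tanh with alpha * beta > 1 enlarges tiny
    # gradients (its slope near zero is alpha * beta) while bounding large
    # gradients at +/- alpha. The function and hyperparameters here are
    # illustrative assumptions, not the paper's prescribed choice.
    return alpha * torch.tanh(beta * grad)

def apply_gaf(model, gaf=gaf_tanh):
    # Transform every parameter gradient in place, after backward() and
    # before the optimizer step.
    for p in model.parameters():
        if p.grad is not None:
            p.grad = gaf(p.grad)

# Sketch of one training step (model, criterion, optimizer, inputs, targets assumed):
#   loss = criterion(model(inputs), targets)
#   loss.backward()
#   apply_gaf(model)      # activate the gradients
#   optimizer.step()
#   optimizer.zero_grad()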

Related Research

07/21/2019
signADAM: Learning Confidences for Deep Neural Networks
In this paper, we propose a new first-order gradient-based algorithm to ...

08/13/2016
SGDR: Stochastic Gradient Descent with Warm Restarts
Restart techniques are common in gradient-free optimization to deal with...

09/13/2020
Understanding Boolean Function Learnability on Deep Neural Networks
Computational learning theory states that many classes of boolean formul...

09/12/2019
diffGrad: An Optimization Method for Convolutional Neural Networks
Stochastic Gradient Descent (SGD) is one of the core techniques behind th...

09/07/2021
Tom: Leveraging trend of the observed gradients for faster convergence
The success of deep learning can be attributed to various factors such a...

08/11/2023
Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation
Deep neural networks are vulnerable to universal adversarial perturbatio...

05/29/2023
Intelligent gradient amplification for deep neural networks
Deep learning models offer superior performance compared to other machin...
