Gradient Amplification: An efficient way to train deep neural networks

06/16/2020
by Sunitha Basodi, et al.

Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the networks. Such deeper networks not only take longer to train but also suffer from the vanishing gradient problem during training. In this work, we propose a gradient amplification approach for training deep learning models that prevents vanishing gradients, and we develop a training strategy that enables or disables gradient amplification across epochs with different learning rates. We perform experiments on VGG-19 and ResNet models (ResNet-18 and ResNet-34) and study the impact of the amplification parameters on these models in detail. Our proposed approach improves the performance of these deep learning models even at higher learning rates, thereby allowing them to achieve higher performance with reduced training time.
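The abstract does not spell out the exact amplification rule, so the following PyTorch sketch is only illustrative: it scales the gradients flowing backward through a chosen subset of ResNet-18 layers by a fixed factor and toggles that amplification on or off per epoch. The factor AMP_FACTOR, the choice of amplified layers, and the epoch schedule AMP_EPOCHS are assumptions for illustration, not values taken from the paper.

```python
# Illustrative sketch of gradient amplification with an on/off epoch schedule.
# AMP_FACTOR, the amplified layers, and AMP_EPOCHS are assumed values.
import torch
import torch.nn as nn
import torchvision.models as models

AMP_FACTOR = 2.0            # assumed amplification factor
AMP_EPOCHS = set(range(30)) # assumed epochs during which amplification is enabled

model = models.resnet18(num_classes=10)

# Assumption: amplify gradients flowing through the early residual blocks.
amplified_layers = [model.layer1, model.layer2]

amplify_enabled = False  # toggled per epoch by the training strategy below

def amplification_hook(module, grad_input, grad_output):
    # Scale the gradients propagated to earlier layers when amplification is on.
    if amplify_enabled:
        return tuple(g * AMP_FACTOR if g is not None else None for g in grad_input)
    return grad_input

for layer in amplified_layers:
    layer.register_full_backward_hook(amplification_hook)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

def train_one_epoch(epoch, loader):
    global amplify_enabled
    amplify_enabled = epoch in AMP_EPOCHS  # enable/disable amplification this epoch
    model.train()
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()   # backward hooks amplify gradients here
        optimizer.step()

# Example usage with random data (illustrative only):
dummy_loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(2)]
for epoch in range(2):
    train_one_epoch(epoch, dummy_loader)
```

A backward hook is used here so the scaling happens during backpropagation itself; an equivalent sketch could instead multiply the parameter gradients of the selected layers after `loss.backward()` and before `optimizer.step()`.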

