Vanishing Nodes: Another Phenomenon That Makes Training Deep Neural Networks Difficult

10/22/2019
by Wen-Yu Chang, et al.

It is well known that the problem of vanishing/exploding gradients is a challenge when training deep networks. In this paper, we describe another phenomenon, called vanishing nodes, that also increases the difficulty of training deep neural networks. As the depth of a neural network increases, its hidden nodes exhibit increasingly correlated behavior, so that the nodes become highly similar to one another. The redundancy of hidden nodes therefore grows with network depth. We call this problem vanishing nodes, and we propose the vanishing node indicator (VNI) as a metric for quantitatively measuring its severity. The VNI can be characterized analytically in terms of the network parameters: it is proportional to the depth of the network and inversely proportional to the network width. The theoretical results show that the effective number of nodes shrinks to one as the VNI approaches one (its maximal value), and that vanishing/exploding gradients and vanishing nodes are two distinct challenges that make training deep neural networks difficult. The numerical results from the experiments suggest that vanishing nodes become more pronounced during back-propagation training, and that when the VNI equals 1, the network cannot learn even simple tasks (e.g., the XOR problem), despite gradients that are neither vanishing nor exploding. We refer to such gradients as walking dead gradients: although their magnitude is large enough, they cannot help the network converge. Finally, the experiments show that the likelihood of failed training increases with network depth, since training becomes much more difficult when the network's representational capability is lost.
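The abstract does not state the closed-form definition of the VNI, but the underlying idea of hidden nodes becoming increasingly correlated with depth can be illustrated with a toy measurement. The Python sketch below uses an illustrative index of my own (the mean squared pairwise correlation between a layer's node activations), not the paper's VNI formula, and the network width, depth, activation, and initialization are hypothetical choices for demonstration only. A value near 1 means the layer's nodes behave almost identically, i.e. the layer effectively contains a single node, which is the regime the abstract describes.

```python
# Illustrative sketch (not the paper's exact VNI definition): measure how
# correlated the nodes of one hidden layer are via the mean squared
# off-diagonal entry of the activation correlation matrix.
import numpy as np


def node_correlation_index(activations: np.ndarray) -> float:
    """activations: array of shape (num_samples, num_nodes) for one hidden layer."""
    # Correlation matrix across nodes; rowvar=False treats columns as variables.
    corr = np.corrcoef(activations, rowvar=False)
    n = corr.shape[0]
    off_diag = corr[~np.eye(n, dtype=bool)]
    # Near 1.0 when every node is (up to sign and scale) a copy of every other
    # node, i.e. the effective number of nodes has collapsed toward one.
    return float(np.mean(off_diag ** 2))


# Example: push random inputs through a deep, randomly initialized tanh MLP
# (hypothetical width/depth) and watch how the index typically grows with depth.
rng = np.random.default_rng(0)
width, depth, num_samples = 32, 30, 512
h = rng.standard_normal((num_samples, width))
for layer in range(depth):
    w = rng.standard_normal((width, width)) / np.sqrt(width)  # naive init
    h = np.tanh(h @ w)
    if (layer + 1) % 10 == 0:
        print(f"layer {layer + 1}: correlation index = {node_correlation_index(h):.3f}")
```

In this kind of toy setup the correlation index tends to rise as depth increases, mirroring the redundancy trend described above; the exact values depend on the assumed width, depth, and initialization.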

Related research

06/16/2020 - Gradient Amplification: An efficient way to train deep neural networks
    Improving performance of deep learning models and reducing their trainin...

08/09/2023 - A Novel Method for improving accuracy in neural network by reinstating traditional back propagation technique
    Deep learning has revolutionized industries like computer vision, natura...

05/29/2023 - Intelligent gradient amplification for deep neural networks
    Deep learning models offer superior performance compared to other machin...

11/14/2015 - Efficient Training of Very Deep Neural Networks for Supervised Hashing
    In this paper, we propose training very deep neural networks (DNNs) for ...

06/04/2021 - Regularization and Reparameterization Avoid Vanishing Gradients in Sigmoid-Type Networks
    Deep learning requires several design choices, such as the nodes' activa...

05/01/2021 - Stochastic Block-ADMM for Training Deep Networks
    In this paper, we propose Stochastic Block-ADMM as an approach to train ...

07/07/2020 - Doubly infinite residual networks: a diffusion process approach
    When neural network's parameters are initialized as i.i.d., neural netwo...
