Training of Deep Neural Networks based on Distance Measures using RMSProp

08/06/2017
by Thomas Kurbiel, et al.

The vanishing gradient problem was a major obstacle to the success of deep learning. In recent years it has been gradually alleviated through multiple techniques. However, the problem has not been overcome in a fundamental way, since it is inherent to neural networks whose activation functions are based on dot products. In a series of papers, we analyze alternative neural network structures that are not based on dot products. In this first paper, we revisit neural networks built from layers based on distance measures and Gaussian activation functions. Such networks were only rarely used in the past, since they are hard to train with plain stochastic gradient descent methods. We show that by using Root Mean Square Propagation (RMSProp) it is possible to efficiently learn multi-layer neural networks of this kind. Furthermore, we show that, when appropriately initialized, these networks suffer far less from the vanishing and exploding gradient problems than traditional neural networks, even at large depth.
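To make the idea concrete, below is a minimal NumPy sketch of one such distance-based layer and its RMSProp update. The exact layer form (a Gaussian of the squared Euclidean distance to a learned centre, with a learnable width) and all function names here are illustrative assumptions, not the paper's precise parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W, gamma):
    """Distance-based layer (illustrative, not the paper's exact form).
    x: (d,) input, W: (k, d) learned centres, gamma: (k,) widths.
    Each unit outputs exp(-gamma_i * ||x - w_i||^2)."""
    diff = x - W                         # (k, d): difference to each centre
    sq_dist = np.sum(diff ** 2, axis=1)  # (k,): squared Euclidean distances
    a = np.exp(-gamma * sq_dist)         # Gaussian activation per unit
    return a, diff, sq_dist

def rmsprop_step(x, y, W, gamma, cache_W, cache_g,
                 lr=1e-2, decay=0.9, eps=1e-8):
    """One RMSProp update on the squared error 0.5*||a - y||^2."""
    a, diff, sq_dist = forward(x, W, gamma)
    err = a - y                          # dL/da

    # Backprop through the Gaussian distance unit.
    dW = (2.0 * err * a * gamma)[:, None] * diff  # dL/dW, shape (k, d)
    dgamma = -err * a * sq_dist                   # dL/dgamma, shape (k,)

    # RMSProp: divide each gradient by a running RMS of its history,
    # equalizing effective step sizes across parameters.
    cache_W = decay * cache_W + (1 - decay) * dW ** 2
    cache_g = decay * cache_g + (1 - decay) * dgamma ** 2
    W = W - lr * dW / (np.sqrt(cache_W) + eps)
    gamma = gamma - lr * dgamma / (np.sqrt(cache_g) + eps)
    return W, gamma, cache_W, cache_g

# Toy usage: fit 3 units to map one input to a one-hot target.
d, k = 4, 3
x = rng.normal(size=d)
y = np.array([1.0, 0.0, 0.0])
W = x + 0.5 * rng.normal(size=(k, d))  # initialize centres near the data
gamma = np.ones(k)
cache_W, cache_g = np.zeros_like(W), np.zeros_like(gamma)
for _ in range(200):
    W, gamma, cache_W, cache_g = rmsprop_step(x, y, W, gamma, cache_W, cache_g)
```

Initializing the centres near the data, as in the toy usage above, reflects the abstract's point that appropriate initialization matters for these units: a centre far from every input yields a Gaussian response near zero, and since the gradient carries the activation as a factor, the gradient vanishes with it.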

