Hebbian-Descent

05/25/2019
by Jan Melchior, et al.

In this work we propose Hebbian-descent as a biologically plausible learning rule for hetero-associative as well as auto-associative learning in single-layer artificial neural networks. It can be used as a replacement for gradient descent as well as Hebbian learning, in particular in online learning, as it inherits their advantages while not suffering from their disadvantages. We discuss the drawbacks of Hebbian learning: it has problems with correlated input data and does not profit from seeing training patterns several times. For gradient descent we identify the derivative of the activation function as problematic, especially in online learning. Hebbian-descent addresses these problems by discarding the activation function's derivative and by centering, i.e. keeping the neural activities mean free. This leads to a biologically plausible update rule that is provably convergent, does not suffer from the vanishing error term problem, can deal with correlated data, profits from seeing patterns several times, and enables successful online learning when centering is used. We discuss its relationship to Hebbian learning, contrastive learning, and gradient descent, and show that in the case of a strictly positive derivative of the activation function, Hebbian-descent leads to the same update rule as gradient descent but for a different loss function. In this case Hebbian-descent inherits the convergence properties of gradient descent, but we also show empirically that it converges when the derivative of the activation function is only non-negative, such as for the step function. Furthermore, in the case of the mean squared error loss, Hebbian-descent can be understood as the difference between two Hebb-learning steps, which for an invertible and integrable activation function actually optimizes a generalized linear model. ...
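As a minimal sketch of the update rule the abstract describes (and not the paper's reference implementation), a single online Hebbian-descent step for a single-layer hetero-associative network can be written as the gradient-descent step with the activation function's derivative removed, applied to centered inputs. The names eta, mu, and phi below are illustrative assumptions, and the step activation is chosen to highlight the non-negative-derivative case mentioned above.

import numpy as np

# Minimal sketch of one online Hebbian-descent step, following the
# abstract's description: the gradient-descent update with the
# activation function's derivative removed, on centered (mean-free)
# inputs. `eta`, `mu`, and `phi` are illustrative assumptions.

def step_fn(a):
    # Step activation: its derivative is zero almost everywhere, so a
    # plain gradient-descent update would vanish here, while the
    # Hebbian-descent update below still learns.
    return (a > 0).astype(a.dtype)

def hebbian_descent_step(W, b, x, t, mu, eta=0.1, phi=step_fn):
    # W: (n_in, n_out) weights, b: (n_out,) bias,
    # x: (n_in,) input, t: (n_out,) target, mu: (n_in,) input mean.
    h = phi((x - mu) @ W + b)          # forward pass on the centered input
    err = h - t                        # error term; note: no phi'(...) factor
    W -= eta * np.outer(x - mu, err)   # = eta*(x-mu)t^T - eta*(x-mu)h^T,
                                       # i.e. the difference of two Hebb steps
    b -= eta * err
    return W, b

Keeping mu at the running mean of the inputs is what the abstract calls centering; the outer-product form makes the "difference between two Hebb-learning steps" interpretation for the mean squared error loss directly visible.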

