Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

01/15/2013
by Tommi Vatanen, et al.

Recently, we proposed transforming the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, while modeling the linear dependencies with separate shortcut connections instead. We continue that work, first by introducing a third transformation that normalizes the scale of the outputs of each hidden neuron, and second by analyzing the connections to second-order optimization methods. We show, both in theory and in experiments, that these transformations make simple stochastic gradient descent behave more like a second-order optimization method and thus speed up learning. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, in which both the inputs and the outputs of many hidden neurons are close to zero.
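To make the three transformations concrete, here is a minimal NumPy sketch of the idea described above: a linear term is added to the nonlinearity so that its average slope is zero, a constant shift makes its average output zero, and a scale factor normalizes the output. The parameter names (alpha, beta, gamma) and the helper functions are illustrative assumptions, not the paper's actual notation or code.

```python
import numpy as np

def transform_params(x, f, f_prime):
    """Estimate per-neuron transformation parameters from a batch of
    pre-activations x (hypothetical helper, for illustration only).

    beta  : makes the average slope  E[f'(x) + beta]           zero
    gamma : makes the average output E[f(x) + beta*x + gamma]  zero
    alpha : rescales the transformed output to unit scale
    """
    beta = -f_prime(x).mean()             # zero slope on average
    gamma = -(f(x) + beta * x).mean()     # zero output on average
    y = f(x) + beta * x + gamma
    alpha = 1.0 / (y.std() + 1e-8)        # normalize the output scale
    return alpha, beta, gamma

def transformed_nonlinearity(x, f, f_prime):
    alpha, beta, gamma = transform_params(x, f, f_prime)
    return alpha * (f(x) + beta * x + gamma)

# Example with tanh, whose derivative is 1 - tanh(x)^2.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)               # stand-in for pre-activations
y = transformed_nonlinearity(x, np.tanh, lambda z: 1.0 - np.tanh(z) ** 2)
print(y.mean(), y.std())                  # roughly zero mean, unit scale
```

In the proposed architecture, the linear behavior removed from the nonlinearity (the beta*x term) is not discarded: it is carried by the separate shortcut connections, so the network's expressive power is preserved while the gradients seen by stochastic gradient descent become better conditioned.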
