Exact Stochastic Second Order Deep Learning

04/08/2021
by Fares B. Mehouachi, et al.

Optimization in Deep Learning is dominated by first-order methods built around the central concept of backpropagation. Second-order optimization methods, which take second-order derivatives into account, are far less used despite their superior theoretical properties. This inadequacy of second-order methods stems from their exorbitant computational cost, poor practical performance, and the ineluctable non-convex nature of Deep Learning. Several attempts have been made to resolve the inadequacy of second-order optimization without reaching a cost-effective solution, much less an exact one. In this work, we show that this long-standing problem in Deep Learning can be solved in the stochastic case, given a suitable regularization of the neural network. Notably, we derive an expression for the stochastic Hessian and its exact eigenvalues. We provide a closed-form formula for the exact stochastic second-order Newton direction, resolve the non-convexity issue, and adjust our exact solution to favor flat minima through regularization and spectral adjustment. We test our exact stochastic second-order method on popular datasets and demonstrate its suitability for Deep Learning.
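The core idea of a spectrally adjusted Newton step can be sketched in a few lines. This is an illustrative toy, not the paper's exact formulation: it uses a dense eigendecomposition and a generic absolute-value spectral adjustment (in the spirit of saddle-free Newton) with a damping term standing in for the regularization; the function name and the `damping` parameter are assumptions for illustration.

```python
import numpy as np

def adjusted_newton_direction(hessian, grad, damping=1e-2):
    """Newton direction with a spectral adjustment for non-convexity.

    Each Hessian eigenvalue is replaced by its absolute value plus a
    damping term, so negative-curvature directions are flipped and the
    resulting step is always a descent direction. This is a generic
    saddle-free-style adjustment, not the paper's exact formula.
    """
    # Eigendecomposition of the (symmetric) Hessian: H = V diag(l) V^T
    eigvals, eigvecs = np.linalg.eigh(hessian)
    adjusted = np.abs(eigvals) + damping  # flip negative curvature, damp
    # d = -V diag(1 / adjusted) V^T g
    return -eigvecs @ ((eigvecs.T @ grad) / adjusted)
```

Even at a saddle point of the loss, where a plain Newton step would move toward the saddle along negative-curvature directions, the adjusted step still decreases the local quadratic model.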


