ISAAC Newton: Input-based Approximate Curvature for Newton's Method

05/01/2023
by Felix Petersen, et al.

We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based only on the input to the respective layer, without a substantial computational overhead. The proposed method enables effective training even in small-batch stochastic regimes, which makes it competitive with first-order as well as second-order methods.
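To make the idea concrete, below is a minimal NumPy sketch of input-based preconditioning for a single linear layer, written from the abstract above rather than from the paper's reference implementation. All names (isaac_style_precondition, grad_W, a, lam) are illustrative assumptions, and the Woodbury-based inversion shown here is one plausible way to keep the overhead small when the batch size b is below the number of neurons n, as the abstract claims.

```python
import numpy as np

def isaac_style_precondition(grad_W, a, lam=1e-2):
    """Input-based preconditioning of a linear layer's weight gradient (sketch).

    grad_W : (m, n) gradient of the loss w.r.t. the layer's weight matrix
    a      : (b, n) inputs to the layer for the current mini-batch
    lam    : Tikhonov damping

    Returns grad_W @ (lam*I + a.T @ a / b)^{-1}, i.e. the gradient conditioned
    by a curvature factor built only from the layer's input, computed via the
    Woodbury identity so that only a (b, b) system has to be solved.
    """
    b = a.shape[0]
    # (b, b) Gram matrix of the batch inputs plus damping.
    K = a @ a.T + b * lam * np.eye(b)
    # Woodbury: (lam*I + a.T @ a / b)^{-1} = (I - a.T @ K^{-1} @ a) / lam
    correction = (grad_W @ a.T) @ np.linalg.solve(K, a)
    return (grad_W - correction) / lam

# Hypothetical usage with a batch much smaller than the layer width.
rng = np.random.default_rng(0)
a = rng.standard_normal((32, 1024))        # batch of 32, 1024 input features
grad_W = rng.standard_normal((512, 1024))  # gradient for a 1024 -> 512 layer
preconditioned_grad = isaac_style_precondition(grad_W, a)
```

Because only a b-by-b system is solved, the extra cost in this sketch scales with the batch size rather than the layer width, which is consistent with the abstract's claim of a vanishing overhead for batch sizes smaller than the number of neurons.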

Related research

- A Stochastic Quasi-Newton Method with Nesterov's Accelerated Gradient (09/09/2019): Incorporating second order curvature information in gradient based metho...
- EscherNet 101 (03/07/2023): A deep learning model, EscherNet 101, is constructed to categorize image...
- SC-Reg: Training Overparameterized Neural Networks under Self-Concordant Regularization (12/14/2021): In this paper we propose the SC-Reg (self-concordant regularization) fra...
- An Adaptive Memory Multi-Batch L-BFGS Algorithm for Neural Network Training (12/14/2020): Motivated by the potential for parallel implementation of batch-based al...
- adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs (11/04/2015): Recurrent Neural Networks (RNNs) are powerful models that achieve except...
- ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure (06/04/2021): Curvature in form of the Hessian or its generalized Gauss-Newton (GGN) a...
- An Adaptive Stochastic Nesterov Accelerated Quasi Newton Method for Training RNNs (09/09/2019): A common problem in training neural networks is the vanishing and/or exp...
