Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks

06/05/2019
by Yi Ren, et al.

We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving the non-convex optimization problems that arise in training deep neural networks with enormous numbers of variables and huge data sets. Our methods use subsampled Gauss-Newton or Fisher information matrices together with either subsampled gradient estimates (fully stochastic) or full gradients (semi-stochastic); in the latter case, we prove that the methods converge to a stationary point. By combining the Sherman-Morrison-Woodbury formula with automatic differentiation (backpropagation), we show how our methods can be implemented efficiently. Finally, numerical results are presented to demonstrate the effectiveness of our proposed methods.
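The efficiency claim rests on the Sherman-Morrison-Woodbury (SMW) identity: when the subsample size m is much smaller than the number of parameters n, the damped (Levenberg-Marquardt) Gauss-Newton system can be solved through a small m-by-m linear system instead of an n-by-n one. Below is a minimal NumPy sketch of that idea, assuming the simplified squared-error Gauss-Newton matrix G = (1/m) J^T J built from a stacked subsampled Jacobian J; the function name smw_gn_step and this simplified form of G are illustrative assumptions, not the paper's exact algorithm.

    import numpy as np

    def smw_gn_step(J, g, lam):
        # Solve (lam*I + (1/m) J^T J) d = -g via the
        # Sherman-Morrison-Woodbury identity:
        #   (lam*I + (1/m) J^T J)^{-1}
        #     = (1/lam) * (I - J^T (m*lam*I + J J^T)^{-1} J)
        # J   : (m, n) stacked per-example Jacobian (subsampled), m << n
        # g   : (n,) gradient estimate (subsampled or full)
        # lam : Levenberg-Marquardt damping parameter, lam > 0
        # Cost: O(m^2 n + m^3) rather than O(n^3).
        m, _ = J.shape
        K = m * lam * np.eye(m) + J @ J.T     # small m x m system
        Jg = J @ g
        return -(g - J.T @ np.linalg.solve(K, Jg)) / lam

    # Illustrative usage with random data (hypothetical sizes):
    rng = np.random.default_rng(0)
    J = rng.standard_normal((32, 1000))       # 32 subsampled rows, 1000 parameters
    g = rng.standard_normal(1000)
    d = smw_gn_step(J, g, lam=1e-2)

In practice the Jacobian-vector products J @ g and J.T @ v would be computed by backpropagation rather than by forming J explicitly, which is where automatic differentiation enters the picture.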


research · 02/01/2018
Distributed Newton Methods for Deep Neural Networks
Deep learning involves a difficult non-convex optimization problem with ...

research · 05/18/2022
On the efficiency of Stochastic Quasi-Newton Methods for Deep Learning
While first-order methods are popular for solving optimization problems ...

research · 06/16/2020
Practical Quasi-Newton Methods for Training Deep Neural Networks
We consider the development of practical stochastic quasi-Newton, and in...

research · 11/14/2018
Newton Methods for Convolutional Neural Networks
Deep learning involves a difficult non-convex optimization problem, whic...

research · 12/02/2021
Newton methods based convolution neural networks using parallel processing
Training of convolutional neural networks is a high dimensional and a no...

research · 02/15/2023
Efficient Inversion of Matrix φ-Functions of Low Order
The paper is concerned with efficient numerical methods for solving a li...

research · 05/21/2018
Never look back - A modified EnKF method and its application to the training of neural networks without back propagation
In this work, we present a new derivative-free optimization method and i...
