A Stochastic Variance Reduced Nesterov's Accelerated Quasi-Newton Method

10/17/2019
by Sota Yasuda, et al.

Recently, algorithms incorporating second-order curvature information have become popular for training neural networks. The Nesterov's Accelerated Quasi-Newton (NAQ) method has been shown to effectively accelerate the BFGS quasi-Newton method by incorporating a momentum term and Nesterov's accelerated gradient vector. A stochastic version of the NAQ method was proposed for training on large-scale problems. However, this method suffers from high stochastic variance. This paper proposes a stochastic variance-reduced Nesterov's Accelerated Quasi-Newton method in full-memory (SVR-NAQ) and limited-memory (SVRLNAQ) forms. The performance of the proposed method is evaluated in TensorFlow on four benchmark problems: two regression problems and two classification problems. The results show improved performance compared to conventional methods.
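
The abstract combines two ideas: a gradient evaluated at a Nesterov lookahead point, and variance reduction of the stochastic gradient. The following is a minimal sketch of how those two pieces could fit together, not the authors' algorithm. It assumes an SVRG-style correction (mini-batch gradient minus the same mini-batch gradient at a snapshot, plus the full gradient at the snapshot), a toy least-squares problem, and NumPy instead of TensorFlow. The BFGS curvature matrix is deliberately replaced by the identity to keep the sketch short, and all names (`batch_grad`, `vr_nesterov_grad`, `mu`, `lr`) are hypothetical.

```python
import numpy as np

def batch_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y)**2) on the given batch."""
    return X.T @ (X @ w - y) / len(y)

def vr_nesterov_grad(w, v, mu, w_snap, full_grad_snap, X_b, y_b):
    """SVRG-style gradient estimate at the lookahead point w + mu * v.

    Assumption: the variance-reduction correction is the standard SVRG one;
    the paper's exact formulation may differ.
    """
    look = w + mu * v
    return batch_grad(look, X_b, y_b) - batch_grad(w_snap, X_b, y_b) + full_grad_snap

# Toy noiseless regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = X @ np.array([1.0, -2.0, 0.5])

w = np.zeros(3)   # parameters
v = np.zeros(3)   # momentum (velocity) term
mu, lr = 0.8, 0.1

for epoch in range(30):
    # Snapshot of the parameters and the full gradient, refreshed once per epoch.
    w_snap = w.copy()
    full_grad_snap = batch_grad(w_snap, X, y)
    for _ in range(8):
        idx = rng.choice(len(y), size=8, replace=False)
        g = vr_nesterov_grad(w, v, mu, w_snap, full_grad_snap, X[idx], y[idx])
        # Identity used in place of the quasi-Newton inverse Hessian approximation.
        v = mu * v - lr * g
        w = w + v
```

In a quasi-Newton variant, the line updating `v` would multiply `g` by an inverse Hessian approximation built from curvature pairs; that part is omitted here because the abstract does not specify it.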


research
09/09/2019

A Stochastic Quasi-Newton Method with Nesterov's Accelerated Gradient

Incorporating second order curvature information in gradient based metho...
research
09/09/2019

An Adaptive Stochastic Nesterov Accelerated Quasi Newton Method for Training RNNs

A common problem in training neural networks is the vanishing and/or exp...
research
12/01/2021

A modified limited memory Nesterov's accelerated quasi-Newton

The Nesterov's accelerated quasi-Newton (L)NAQ method has shown to accel...
research
04/06/2020

Deep Neural Network Learning with Second-Order Optimizers – a Practical Study with a Stochastic Quasi-Gauss-Newton Method

Training in supervised deep learning is computationally demanding, and t...
research
12/10/2019

A Stochastic Quasi-Newton Method for Large-Scale Nonconvex Optimization with Applications

This paper proposes a novel stochastic version of damped and regularized...
research
01/19/2022

Variance-Reduced Stochastic Quasi-Newton Methods for Decentralized Learning: Part I

In this work, we investigate stochastic quasi-Newton methods for minimiz...
research
02/26/2018

GPU Accelerated Sub-Sampled Newton's Method

First order methods, which solely rely on gradient information, are comm...
