Structured Stochastic Quasi-Newton Methods for Large-Scale Optimization Problems

06/17/2020
by   Minghan Yang, et al.

In this paper, we consider large-scale finite-sum nonconvex problems arising from machine learning. Since the Hessian is often a sum of a relatively cheap and accessible part and an expensive or even inaccessible part, we construct a stochastic quasi-Newton matrix that uses as much of the partial Hessian information as possible. By further exploiting low-rank structures based on the Nyström approximation, the computation of the quasi-Newton direction becomes affordable. To make full use of the gradient estimate, we also develop an extra-step strategy for this framework. Global convergence to a stationary point in expectation and a local superlinear convergence rate are established under mild assumptions. Numerical experiments on logistic regression, deep autoencoder networks, and deep learning problems show that the efficiency of the proposed method is at least comparable to that of state-of-the-art methods.
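The abstract only sketches how a Nyström-based low-rank approximation keeps the quasi-Newton step affordable. The snippet below is a minimal, illustrative sketch of that general idea, not the authors' algorithm: it builds a rank-r Nyström approximation of an accessible Hessian block from Hessian-vector products and solves for a regularized Newton-type direction via the Woodbury identity. The function name nystrom_direction, the regularization parameter reg, and the specific solve are assumptions introduced here for illustration.

```python
import numpy as np

def nystrom_direction(hvp, grad, dim, rank=20, reg=1e-3, rng=None):
    """Illustrative sketch: a regularized Newton-type step using a rank-`rank`
    Nystrom approximation of the accessible part of the Hessian.

    hvp  : callable v -> H @ v (Hessian-vector product with the cheap part)
    grad : current (stochastic) gradient estimate g
    Returns d solving (H_nys + reg*I) d = -g, where H_nys = C W^{-1} C^T.
    The Woodbury identity reduces the solve to a rank-by-rank system; it
    assumes the core matrix W = Omega^T H Omega is invertible.
    """
    rng = np.random.default_rng() if rng is None else rng
    Omega = rng.standard_normal((dim, rank))                      # random test matrix
    C = np.column_stack([hvp(Omega[:, j]) for j in range(rank)])  # C = H @ Omega
    W = Omega.T @ C                                               # W = Omega^T H Omega
    # Woodbury: (reg*I + C W^{-1} C^T)^{-1} g
    #   = (1/reg) * [ g - C (W + C^T C / reg)^{-1} (C^T g) / reg ]
    small = W + (C.T @ C) / reg
    d = -(grad - C @ np.linalg.solve(small, (C.T @ grad) / reg)) / reg
    return d


# Toy usage on a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x,
# where A plays the role of the cheap, accessible Hessian part.
if __name__ == "__main__":
    dim = 500
    rng = np.random.default_rng(0)
    M = rng.standard_normal((dim, dim))
    A = M @ M.T / dim + np.eye(dim)            # symmetric positive definite Hessian
    b = rng.standard_normal(dim)
    x = np.zeros(dim)
    g = A @ x - b                              # exact gradient at x
    d = nystrom_direction(lambda v: A @ v, g, dim, rank=50, reg=1e-2, rng=rng)
    x_new = x + d                              # one approximate Newton-type step
    print("residual norm:", np.linalg.norm(A @ x_new - b))
```

The key cost saving is that only rank Hessian-vector products and one rank-by-rank linear solve are needed per step, rather than forming or factoring the full Hessian.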


Related research

09/27/2016
Exact and Inexact Subsampled Newton Methods for Optimization
The paper studies the solution of stochastic optimization problems in wh...

05/27/2020
PNKH-B: A Projected Newton-Krylov Method for Large-Scale Bound-Constrained Optimization
We present PNKH-B, a projected Newton-Krylov method with a low-rank appr...

06/07/2023
Quasi-Newton Updating for Large-Scale Distributed Learning
Distributed computing is critically important for modern statistical ana...

08/19/2021
Using Multilevel Circulant Matrix Approximate to Speed Up Kernel Logistic Regression
Kernel logistic regression (KLR) is a classical nonlinear classifier in ...

12/10/2020
Stochastic Damped L-BFGS with Controlled Norm of the Hessian Approximation
We propose a new stochastic variance-reduced damped L-BFGS algorithm, wh...

06/10/2020
Sketchy Empirical Natural Gradient Methods for Deep Learning
In this paper, we develop an efficient sketchy empirical natural gradien...

03/27/2013
Efficiently Using Second Order Information in Large l1 Regularization Problems
We propose a novel general algorithm LHAC that efficiently uses second-o...
