Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

07/01/2018
by Jennifer B. Erway, et al.

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods based on stochastic gradient descent scale easily to very large problems but may require fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manual tuning of hyper-parameters, but they suffer from approximating a potentially indefinite Hessian with a positive-definite matrix. Hessian-free methods leverage the ability to compute Hessian-vector products without forming the entire Hessian matrix, but the cost of each iteration is significantly greater than that of quasi-Newton methods. In this paper we propose an alternative approach for solving ML problems: a quasi-Newton trust-region framework for large-scale optimization that allows indefinite Hessian approximations. Numerical experiments on a standard test data set show that, given a fixed computational time budget, the proposed methods achieve better results than traditional limited-memory BFGS and Hessian-free methods.
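The abstract leaves the specific quasi-Newton update and subproblem solver unspecified, so the sketch below only illustrates the general idea in miniature: a symmetric-rank-one (SR1) approximation, which is allowed to become indefinite, combined with a Steihaug-Toint truncated-CG trust-region solver that exploits directions of negative curvature. The dense matrix B, the Rosenbrock test objective, and all parameter values are illustrative assumptions rather than the paper's limited-memory implementation or its ML experiments.

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Symmetric-rank-one update; B may become indefinite (dense here for clarity)."""
    r = y - B @ s
    denom = r @ s
    # Standard safeguard: skip the update when the denominator is too small.
    if abs(denom) > eps * np.linalg.norm(r) * np.linalg.norm(s):
        B = B + np.outer(r, r) / denom
    return B

def to_boundary(p, d, delta):
    """Positive tau with ||p + tau*d|| = delta (p is strictly inside the region)."""
    a, b, c = d @ d, 2.0 * (p @ d), p @ p - delta**2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def steihaug_cg(B, g, delta, tol=1e-10):
    """Approximately minimize g'p + 0.5 p'Bp subject to ||p|| <= delta.
    Directions of negative curvature are followed to the trust-region boundary."""
    p, r = np.zeros_like(g), g.copy()
    d = -r
    if np.linalg.norm(r) < tol:
        return p
    for _ in range(2 * g.size):
        Bd = B @ d
        curv = d @ Bd
        if curv <= 0.0:                        # indefinite B: negative curvature
            return p + to_boundary(p, d, delta) * d
        alpha = (r @ r) / curv
        p_new = p + alpha * d
        if np.linalg.norm(p_new) >= delta:     # step would leave the region
            return p + to_boundary(p, d, delta) * d
        r_new = r + alpha * Bd
        if np.linalg.norm(r_new) < tol:
            return p_new
        d = -r_new + ((r_new @ r_new) / (r @ r)) * d
        p, r = p_new, r_new
    return p

# Illustrative driver on the Rosenbrock function (not the paper's ML experiments).
f = lambda x: 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2
grad = lambda x: np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                           200.0 * (x[1] - x[0]**2)])

x, B, delta = np.array([-1.2, 1.0]), np.eye(2), 1.0
g = grad(x)
for k in range(200):
    p = steihaug_cg(B, g, delta)
    pred = -(g @ p + 0.5 * p @ (B @ p))        # predicted reduction of the model
    x_trial = x + p
    g_trial = grad(x_trial)
    rho = (f(x) - f(x_trial)) / pred if pred > 0 else -1.0
    B = sr1_update(B, p, g_trial - g)          # SR1 uses the trial pair even if the step is rejected
    if rho < 0.1:
        delta *= 0.5                           # poor model fit: shrink the trust region
    elif rho > 0.75 and np.linalg.norm(p) >= 0.99 * delta:
        delta *= 2.0                           # good fit at the boundary: expand
    if rho > 1e-4:                             # accept the trial point
        x, g = x_trial, g_trial
    if np.linalg.norm(g) < 1e-6:
        break
print(f"iterations: {k}, x = {x}, f(x) = {f(x):.3e}")
```

At ML scale one would replace the dense B with a limited-memory compact representation that stores only a handful of recent (s, y) pairs and applies B to vectors implicitly, which is the large-scale regime the abstract targets.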


Related research

09/04/2019  Quasi-Newton Optimization Methods For Deep Learning Applications
Deep learning algorithms often require solving a highly non-linear and n...

07/25/2023  mL-BFGS: A Momentum-based L-BFGS for Distributed Large-Scale Neural Network Optimization
Quasi-Newton methods still face significant challenges in training large...

11/06/2018  Quasi-Newton Optimization in Deep Q-Learning for Playing ATARI Games
Reinforcement Learning (RL) algorithms allow artificial agents to improv...

07/13/2021  A New Multipoint Symmetric Secant Method with a Dense Initial Matrix
In large-scale optimization, when either forming or storing Hessian matr...

01/28/2019  Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample
We present two sampled quasi-Newton methods for deep learning: sampled L...

08/02/2021  Computing the Newton-step faster than Hessian accumulation
Computing the Newton-step of a generic function with N decision variable...

02/12/2018  Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization
We present the first accelerated randomized algorithm for solving linear...
