Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

07/15/2021
by Michał Dereziński, et al.

In second-order optimization, a potential bottleneck is computing the Hessian matrix of the objective function at every iteration. Randomized sketching has emerged as a powerful technique for constructing estimates of the Hessian that can be used to perform approximate Newton steps. This involves multiplication by a random sketching matrix, which introduces a trade-off between the computational cost of sketching and the convergence rate of the optimization algorithm. A theoretically desirable but computationally prohibitive choice is a dense Gaussian sketching matrix, which produces unbiased estimates of the exact Newton step and offers strong problem-independent convergence guarantees. We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching, without substantially affecting its convergence properties. This approach, called Newton-LESS, is based on a recently introduced sketching technique: LEverage Score Sparsified (LESS) embeddings. We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings, not just up to constant factors but even down to lower-order terms, for a large class of optimization tasks. In particular, this leads to a new state-of-the-art convergence result for an iterative least squares solver. Finally, we extend LESS embeddings to include uniformly sparsified random sign matrices, which can be implemented efficiently and which perform well in numerical experiments.
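To make the sketched Newton update concrete, below is a minimal NumPy sketch of the simpler variant mentioned at the end of the abstract: a uniformly sparsified random sign sketching matrix applied inside an iterative least-squares solver. This is an illustration under stated assumptions, not the authors' implementation; the function names, the sketch size m = 4d, and the per-row sparsity s are hypothetical choices, and the sparsity pattern here is uniform rather than leverage-score-based as in full LESS embeddings.

```python
import numpy as np

def less_uniform_sketch_apply(A, m, s, rng):
    # Compute S @ A for an m-row uniformly sparsified sign sketch S
    # (hypothetical simplified construction): each row of S has s
    # nonzeros at uniformly random positions, each +/- sqrt(n/(s*m)),
    # so that E[S^T S] = I_n. Row-by-row application costs O(m*s*d),
    # versus O(m*n*d) for a dense Gaussian sketch.
    n, d = A.shape
    scale = np.sqrt(n / (s * m))
    SA = np.empty((m, d))
    for i in range(m):
        idx = rng.choice(n, size=s, replace=False)
        signs = rng.choice([-1.0, 1.0], size=s)
        SA[i] = scale * (signs @ A[idx])
    return SA

def sketched_newton_lstsq(A, b, iters=10, m=None, s=None, seed=0):
    # Minimize f(x) = 0.5 * ||Ax - b||^2 with sketched Newton steps:
    # exact gradient, but the Hessian A^T A is replaced by the
    # estimate (SA)^T (SA) with a fresh sparse sketch S per iteration.
    n, d = A.shape
    m = m or 4 * d        # sketch size: a small multiple of d (assumption)
    s = s or min(n, d)    # nonzeros per sketch row (assumption)
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        SA = less_uniform_sketch_apply(A, m, s, rng)
        H_hat = SA.T @ SA
        x -= np.linalg.solve(H_hat, grad)  # approximate Newton step
    return x

# Usage: solve a random overdetermined least-squares problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 50))
b = rng.standard_normal(2000)
x_hat = sketched_newton_lstsq(A, b)
print(np.linalg.norm(A.T @ (A @ x_hat - b)))  # gradient norm should shrink
```

Refreshing the sketch at every iteration and pairing the sketched Hessian with the exact gradient yields local linear convergence at a rate governed by how well (SA)^T(SA) approximates A^T A; the paper's point is that sparse rows retain nearly the same rate as a dense Gaussian sketch at a fraction of the sketching cost.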


Related research:

- 02/10/2020 · SPAN: A Stochastic Projected Approximate Newton Method
  Second-order optimization methods have desirable convergence properties....
- 04/20/2022 · Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence
  We consider minimizing a smooth and strongly convex objective function u...
- 06/26/2020 · Newton retraction as approximate geodesics on submanifolds
  Efficient approximation of geodesics is crucial for practical algorithms...
- 03/21/2019 · OverSketched Newton: Fast Convex Optimization for Serverless Systems
  Motivated by recent developments in serverless systems for large-scale m...
- 05/23/2019 · Scale Invariant Power Iteration
  Power iteration has been generalized to solve many interesting problems...
- 11/06/2019 · Faster Least Squares Optimization
  We investigate randomized methods for solving overdetermined linear leas...
- 02/21/2020 · Optimal Randomized First-Order Methods for Least-Squares Problems
  We provide an exact analysis of a class of randomized algorithms for sol...
