Learning-Augmented Sketches for Hessians

02/24/2021
by Yi Li, et al.

Sketching is a dimensionality reduction technique in which one compresses a matrix via linear combinations of its rows or columns, typically chosen at random. A line of work has shown how to sketch the Hessian to speed up each iteration of a second-order method, but such sketches usually depend only on the matrix at hand, and in a number of cases are even oblivious to the input matrix. One could instead hope to learn a distribution over sketching matrices that is optimized for the specific distribution of input matrices. We show how to design learned sketches for the Hessian in the context of second-order methods, where we learn potentially different sketches for the different iterations of an optimization procedure. We show empirically that learned sketches, compared with their "non-learned" counterparts, improve the approximation accuracy for important problems, including LASSO, SVM, and matrix estimation with nuclear norm constraints. Several of our schemes can be proven to perform no worse than their unlearned counterparts. Additionally, we show that a smaller sketching dimension suffices for sketching the column space of a tall matrix, assuming access to an oracle that predicts which rows have large leverage scores.
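The paper learns the sketching matrix itself; for orientation, the snippet below is a minimal, non-learned CountSketch applied to the Hessian of a least-squares objective, roughly the kind of sketch the learned variants build on. The `countsketch` helper, the problem sizes, and the single sketched Newton step are illustrative assumptions, not the authors' construction; in a learned sketch, the nonzero positions (and possibly values) of the sketching matrix would be chosen from training data rather than uniformly at random.

```python
import numpy as np

def countsketch(A, m, rng):
    """Apply a CountSketch matrix S (m x n) to A (n x d): each row of A is
    hashed to one of m buckets with a random sign. Computes S @ A in O(nnz(A))."""
    n, d = A.shape
    buckets = rng.integers(0, m, size=n)      # hash h(i) in {0, ..., m-1}
    signs = rng.choice([-1.0, 1.0], size=n)   # random sign s(i)
    SA = np.zeros((m, d))
    np.add.at(SA, buckets, signs[:, None] * A)  # row i of A added to bucket h(i)
    return SA

# Toy least-squares problem: minimize ||Ax - b||^2, whose Hessian is H = A^T A.
rng = np.random.default_rng(0)
n, d, m = 5000, 50, 400
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Approximate the Hessian via the sketch: (SA)^T (SA) ≈ A^T A.
SA = countsketch(A, m, rng)
H_sketch = SA.T @ SA

# One sketched Newton step (exact gradient, sketched Hessian).
x = np.zeros(d)
grad = A.T @ (A @ x - b)
x_new = x - np.linalg.solve(H_sketch, grad)

x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
print("relative error vs. exact solve:",
      np.linalg.norm(x_new - x_exact) / np.linalg.norm(x_exact))
```

Because CountSketch has a single nonzero per column of S, forming SA costs time proportional to the number of nonzeros of A; learned variants keep this sparsity and only change where those nonzeros are placed.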


Related research

Second-order optimization with lazy Hessians (12/01/2022): We analyze Newton's method with lazy Hessian updates for solving general...

Learning the Positions in CountSketch (06/11/2023): We consider sketching algorithms which first compress data by multiplica...

Non-PSD Matrix Sketching with Applications to Regression and Optimization (06/16/2021): A variety of dimensionality reduction techniques have been applied for c...

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning (06/01/2020): We introduce AdaHessian, a second order stochastic optimization algorith...

Generalizing and Improving Jacobian and Hessian Regularization (12/01/2022): Jacobian and Hessian regularization aim to reduce the magnitude of the f...

First-Order Preconditioning via Hypergradient Descent (10/18/2019): Standard gradient descent methods are susceptible to a range of issues t...
