Distributed estimation of the inverse Hessian by determinantal averaging

05/28/2019
by Michał Dereziński, et al.

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not equal the inverse of the sum. An example occurs in distributed Newton's method, where we wish to compute (or implicitly work with) the inverse Hessian multiplied by the gradient. In this case, locally computed estimates are biased, so taking a uniform average does not recover the correct solution. To address this, we propose determinantal averaging, a new approach for correcting the inversion bias. The approach reweights each local estimate of the Newton step proportionally to the determinant of the corresponding local Hessian estimate, and then averages the reweighted estimates to obtain an improved global estimate. This method provides the first known distributed Newton step that is asymptotically consistent, i.e., it recovers the exact step in the limit as the number of distributed partitions grows to infinity. To show this, we develop new expectation identities and moment bounds for the determinant and adjugate of a random matrix. Determinantal averaging applies not only to Newton's method but to computing any quantity that is a linear transformation of a matrix inverse, e.g., the trace of the inverse covariance matrix, which is used in data uncertainty quantification.
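As a rough illustration of the reweighting scheme described in the abstract, here is a minimal NumPy sketch (not the authors' code) that combines local Newton steps by determinant weighting. The function name is hypothetical, and we assume each local Hessian estimate is positive definite; log-determinants are used only for numerical stability.

```python
import numpy as np

def determinantal_average_newton_step(local_hessians, grad):
    """Sketch of determinantal averaging for a distributed Newton step.

    Each local step H_i^{-1} g is weighted proportionally to det(H_i),
    then the weighted steps are averaged. Assumes every local Hessian
    estimate H_i is positive definite (so its determinant is positive).
    """
    # Solve each local system H_i x_i = g to get the local Newton steps.
    local_steps = [np.linalg.solve(H, grad) for H in local_hessians]

    # slogdet returns (sign, log|det|); we keep the log-determinant
    # and work in log-space to avoid overflow for large matrices.
    logdets = np.array([np.linalg.slogdet(H)[1] for H in local_hessians])

    # Normalize the determinant weights so they sum to one.
    weights = np.exp(logdets - logdets.max())
    weights /= weights.sum()

    # Determinant-weighted average of the local Newton steps.
    return sum(w * x for w, x in zip(weights, local_steps))
```

Because each weight is proportional to det(H_i), partitions whose local Hessian estimates are closer to singular contribute less; per the abstract, this weighting is what makes the averaged step asymptotically consistent as the number of partitions grows.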


