Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds

by Burak Bartan, et al.

We consider distributed optimization methods for problems where forming the Hessian is computationally challenging and communication is a significant bottleneck. We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems. We derive novel approximation guarantees for classical sketching methods and establish tight concentration results that serve as both upper and lower bounds on the error. We then extend our analysis to the accuracy of parameter averaging for distributed sketches. Furthermore, we develop unbiased parameter averaging methods for randomized second order optimization for regularized problems that employ sketching of the Hessian. Existing works do not take the bias of the estimators into consideration, which limits their application to massively parallel computation. We provide closed-form formulas for regularization parameters and step sizes that provably minimize the bias for sketched Newton directions. Additionally, we demonstrate the implications of our theoretical findings via large scale experiments on a serverless cloud computing platform.
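To make the setting concrete, the snippet below is a minimal sketch (not the paper's exact algorithm) of distributed sketching with parameter averaging for regularized least squares: each simulated worker applies an independent Gaussian sketch `S` to the data, solves the sketched ridge problem using the sketched Hessian, and the coordinator averages the workers' solutions. The sketch sizes, the plain Gaussian sketch, and the uncorrected regularization parameter `lam` are illustrative assumptions; the paper derives closed-form regularization and step-size choices that remove the averaging bias, which this toy example does not implement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem dimensions (illustrative): n data rows, d features,
# sketch size m per worker, q workers.
n, d, m, q = 2000, 50, 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)
lam = 1.0  # ridge regularization (uncorrected; see the paper for bias-minimizing choices)

def sketched_solution(A, b, m, lam, rng):
    """One worker: apply a Gaussian sketch S (m x n) and solve the
    sketched regularized least-squares problem using the sketched Hessian."""
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    SA = S @ A
    # Sketched Hessian approximation: (SA)^T (SA) + lam * I
    H = SA.T @ SA + lam * np.eye(A.shape[1])
    return np.linalg.solve(H, A.T @ b)

# Coordinator averages the q independent sketched solutions.
x_avg = np.mean([sketched_solution(A, b, m, lam, rng) for _ in range(q)], axis=0)

# Exact ridge solution for comparison.
x_exact = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
rel_err = np.linalg.norm(x_avg - x_exact) / np.linalg.norm(x_exact)
```

Averaging over independent sketches reduces the variance of the estimate roughly by a factor of `q`, but a residual bias remains unless the regularization is corrected, which is the gap the paper's closed-form formulas address.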


Distributed Averaging Methods for Randomized Second Order Optimization

We consider distributed optimization problems where forming the Hessian ...

Distributed Sketching Methods for Privacy Preserving Regression

In this work, we study distributed sketching methods for large scale reg...

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, ...

OverSketched Newton: Fast Convex Optimization for Serverless Systems

Motivated by recent developments in serverless systems for large-scale m...

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

We address the statistical and optimization impacts of using classical s...

Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional Optimization: Sharp Analysis and Lower Bounds

We propose novel randomized optimization methods for high-dimensional co...

Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization

Dimension is an inherent bottleneck to some modern learning tasks, where...