Sub-Sampled Newton Methods II: Local Convergence Rates

01/18/2016
by Farbod Roosta-Khorasani, et al.

Many data-fitting applications require the solution of an optimization problem involving a sum of a large number of functions of a high-dimensional parameter. Here, we consider the problem of minimizing a sum of n functions over a convex constraint set X ⊆ R^p, where both n and p are large. In such problems, sub-sampling as a way to reduce n can offer significant computational savings. Within the context of second-order methods, we first give quantitative local convergence results for variants of Newton's method in which the Hessian is uniformly sub-sampled. Using random matrix concentration inequalities, one can sub-sample in a way that preserves the curvature information. With such a sub-sampling strategy, we establish locally Q-linear and Q-superlinear convergence rates. We also give additional convergence results for the case where the sub-sampled Hessian is regularized, either by modifying its spectrum or by a Levenberg-type regularization. Finally, in addition to Hessian sub-sampling, we consider sub-sampling the gradient as a way to further reduce the computational complexity per iteration. We use approximate matrix multiplication results from randomized numerical linear algebra (RandNLA) to obtain the proper sampling strategy, and we establish locally R-linear convergence rates. In this setting, we also show that a very aggressive increase in the sample size yields an R-superlinearly convergent algorithm. While the sample size depends on the condition number of the problem, our convergence rates are problem-independent, i.e., they do not depend on quantities related to the problem. Hence, our analysis here can be used to complement the results of our basic framework from the companion paper [38] by exploring algorithmic trade-offs that are important in practice.
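To make the setup concrete, the following is a minimal sketch of Newton's method with a uniformly sub-sampled Hessian, applied to an unconstrained least-squares problem. It is an illustration under stated assumptions, not the authors' implementation: the names A, b, sample_size, and ridge are illustrative, the ridge term only stands in for a Levenberg-type regularization of the sub-sampled Hessian, and the full gradient is used even though the paper also analyzes sub-sampling the gradient.

import numpy as np

def subsampled_newton(A, b, x0, sample_size, ridge=1e-8,
                      step_size=1.0, n_iters=20, rng=None):
    """Newton iterations for min_x (1/n) * sum_i 0.5*(a_i^T x - b_i)^2,
    using the full gradient and a Hessian estimated from a uniform
    sample of the rows of A."""
    rng = np.random.default_rng(rng)
    n, p = A.shape
    x = x0.copy()
    for _ in range(n_iters):
        # Full gradient of the average least-squares loss.
        grad = A.T @ (A @ x - b) / n
        # Uniformly sub-sample rows and form the sample-average Hessian.
        idx = rng.choice(n, size=sample_size, replace=False)
        H = A[idx].T @ A[idx] / sample_size
        # Small ridge term keeps the sub-sampled Hessian invertible
        # (a stand-in for Levenberg-type regularization).
        step = np.linalg.solve(H + ridge * np.eye(p), grad)
        x -= step_size * step
    return x

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 20))
x_true = rng.standard_normal(20)
b = A @ x_true + 0.01 * rng.standard_normal(5000)
x_hat = subsampled_newton(A, b, np.zeros(20), sample_size=500, rng=1)
print(np.linalg.norm(x_hat - x_true))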


Related research

- Sub-Sampled Newton Methods I: Globally Convergent Algorithms (01/18/2016)
- Convergence rates of sub-sampled Newton methods (08/12/2015)
- Hybrid Deterministic-Stochastic Methods for Data Fitting (04/13/2011)
- Newton-Stein Method: An optimization method for GLMs via Stein's Lemma (11/28/2015)
- GPU Accelerated Sub-Sampled Newton's Method (02/26/2018)
- A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Space (02/12/2020)
- Sub-sampled Newton Methods with Non-uniform Sampling (07/02/2016)
