Sub-Sampled Newton Methods I: Globally Convergent Algorithms

01/18/2016
by Farbod Roosta-Khorasani, et al.

Large-scale optimization problems are ubiquitous in machine learning and data analysis, and there is a plethora of algorithms for solving them. Many of these algorithms employ sub-sampling as a way to speed up the computations and/or to implicitly implement a form of statistical regularization. In this paper, we consider second-order iterative optimization algorithms and provide convergence bounds for variants of Newton's method that incorporate uniform sub-sampling to estimate the gradient and/or the Hessian. Our bounds are non-asymptotic and quantitative. Our algorithms are globally convergent, i.e., guaranteed to converge from any initial iterate. Using random matrix concentration inequalities, one can sub-sample the Hessian in a way that preserves its curvature information. Our first algorithm incorporates Hessian sub-sampling while using the full gradient. We also give additional convergence results for the case where the sub-sampled Hessian is regularized, either by modifying its spectrum or through ridge-type regularization. Next, in addition to Hessian sub-sampling, we consider sub-sampling the gradient as a way to further reduce the computational complexity per iteration, and we use approximate matrix multiplication results from randomized numerical linear algebra to obtain the proper sampling strategy. In all these algorithms, computing the update boils down to solving a large-scale linear system, which can be computationally expensive. As a remedy, for all of our algorithms we also give global convergence results for the case of inexact updates, in which this linear system is solved only approximately. This paper has a more advanced companion, [42], in which a finer-grained analysis yields problem-independent local convergence bounds for these algorithms and explores trade-offs that improve upon the basic results of the present paper.
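To make the setup concrete, below is a minimal sketch of the kind of iteration the abstract describes: Newton's method with a uniformly sub-sampled Hessian, the full gradient, a ridge-regularized curvature estimate, an inexact conjugate-gradient solve for the update, and a backtracking line search for global convergence. The finite-sum objective, the callables F, full_grad, and hess_i, and all parameter values are illustrative assumptions for this sketch, not the paper's exact algorithms or constants.

```python
# Sketch of a sub-sampled Newton iteration for F(x) = (1/n) * sum_i f_i(x).
# Assumed inputs: F(x) -> scalar, full_grad(x) -> gradient of F,
# hess_i(x, i) -> Hessian of the i-th term. All hyperparameters are illustrative.
import numpy as np
from scipy.sparse.linalg import cg

def subsampled_newton(x0, F, full_grad, hess_i, n, sample_size=200,
                      ridge=1e-3, cg_iters=20, max_iters=50, beta=0.5, c1=1e-4):
    rng = np.random.default_rng(0)
    x = x0.copy()
    for _ in range(max_iters):
        g = full_grad(x)                                   # full gradient
        S = rng.choice(n, size=min(sample_size, n), replace=False)
        H = sum(hess_i(x, i) for i in S) / len(S)          # uniformly sub-sampled Hessian
        H = H + ridge * np.eye(x.size)                     # ridge-type regularization
        p, _ = cg(H, -g, maxiter=cg_iters)                 # inexact solve of H p = -g
        t = 1.0                                            # Armijo backtracking line search
        while F(x + t * p) > F(x) + c1 * t * (g @ p):
            t *= beta
        x = x + t * p
    return x
```

Gradient sub-sampling would replace full_grad with an average over a second, independently drawn sample, and the CG iteration cap is one simple way to realize the "inexact update" regime the abstract refers to.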


Related research

Sub-Sampled Newton Methods II: Local Convergence Rates (01/18/2016)
Many data-fitting applications require the solution of an optimization p...

A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Space (02/12/2020)
In supervised learning using kernel methods, we encounter a large-scale ...

Regularization by Denoising Sub-sampled Newton Method for Spectral CT Multi-Material Decomposition (03/25/2021)
Spectral Computed Tomography (CT) is an emerging technology that enables...

GPU Accelerated Sub-Sampled Newton's Method (02/26/2018)
First order methods, which solely rely on gradient information, are comm...

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments (02/10/2015)
In this era of large-scale data, distributed systems built on top of clu...

Sub-sampled Newton Methods with Non-uniform Sampling (07/02/2016)
We consider the problem of finding the minimizer of a convex function F:...

Newton-Stein Method: An optimization method for GLMs via Stein's Lemma (11/28/2015)
We consider the problem of efficiently computing the maximum likelihood ...
