Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses

07/03/2019
by   Ulysse Marteau-Ferey, et al.

In this paper, we study large-scale convex optimization algorithms based on the Newton method applied to regularized generalized self-concordant losses, which include logistic regression and softmax regression. We first prove that our new simple scheme, based on a sequence of problems with decreasing regularization parameters, is provably globally convergent, and that this convergence is linear with a constant factor which scales only logarithmically with the condition number. In the parametric setting, we obtain an algorithm with the same scaling as regular first-order methods but with an improved behavior, in particular on ill-conditioned problems. Second, in the non-parametric machine learning setting, we provide an explicit algorithm combining the previous scheme with Nyström projection techniques, and prove that it achieves optimal generalization bounds with a time complexity of order O(n df_λ), a memory complexity of order O(df_λ^2), and no dependence on the condition number, generalizing the results known for least-squares regression. Here n is the number of observations and df_λ is the associated number of degrees of freedom. In particular, this is the first large-scale algorithm to solve logistic and softmax regressions in the non-parametric setting with large condition numbers and theoretical guarantees.
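To make the decreasing-regularization idea concrete, here is a minimal NumPy sketch (not the authors' exact algorithm): full Newton steps on l2-regularized logistic regression with labels in {-1, +1}, warm-started along a geometrically decreasing sequence of regularization parameters. The function names, the halving factor, and the fixed number of inner Newton steps are illustrative assumptions; the paper's method additionally combines such a scheme with Nyström projections in the non-parametric setting.

```python
# Minimal sketch, assuming l2-regularized logistic regression with y in {-1, +1}.
import numpy as np

def newton_logistic(X, y, lam, w0, n_steps=5):
    """A few Newton steps on the regularized logistic loss (illustrative only)."""
    n, d = X.shape
    w = w0.copy()
    for _ in range(n_steps):
        z = X @ w
        p = 1.0 / (1.0 + np.exp(-y * z))           # sigma(y * x^T w)
        grad = X.T @ (-y * (1.0 - p)) / n + lam * w
        s = p * (1.0 - p)                           # per-sample curvature weights
        H = X.T @ (X * s[:, None]) / n + lam * np.eye(d)
        w -= np.linalg.solve(H, grad)               # a damped step size could be used
    return w

def decreasing_lambda_scheme(X, y, lam_target, lam0=1.0, factor=0.5):
    """Warm-started Newton over lambda_0 > lambda_1 > ... >= lam_target."""
    w = np.zeros(X.shape[1])
    lam = lam0
    while lam > lam_target:
        w = newton_logistic(X, y, lam, w)
        lam = max(lam * factor, lam_target)
    return newton_logistic(X, y, lam_target, w)
```

The point of the warm starts is that each subproblem with a larger regularization parameter is well conditioned, so its solution lands in the region where Newton's method converges fast for the next, less regularized problem.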


Related research:
- 02/08/2019: Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance
- 06/15/2017: Second-Order Kernel Online Convex Optimization with Adaptive Sketching
- 05/09/2015: Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
- 08/19/2021: Using Multilevel Circulant Matrix Approximate to Speed Up Kernel Logistic Regression
- 02/11/2022: Scale-free Unconstrained Online Learning for Curved Losses
- 09/02/2022: Optimal Diagonal Preconditioning: Theory and Practice
- 09/07/2020: Escaping Saddle Points in Ill-Conditioned Matrix Completion with a Scalable Second Order Method
