Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation

01/09/2023
by Xiao-Tong Yuan, et al.

Stochastic proximal point (SPP) methods have gained recent attention for stochastic optimization, offering strong convergence guarantees and robustness superior to classic stochastic gradient descent (SGD) at little to no added computational cost. In this article, we study a minibatch variant of SPP, namely M-SPP, for solving convex composite risk minimization problems. The core contribution is a set of novel excess risk bounds for M-SPP derived through the lens of algorithmic stability theory. In particular, under smoothness and quadratic growth conditions, we show that M-SPP with minibatch size n and iteration count T enjoys an in-expectation fast rate of convergence consisting of an 𝒪(1/T^2) bias-decaying term and an 𝒪(1/nT) variance-decaying term. In the small-n-large-T setting, this result substantially improves the best known results for SPP-type approaches by revealing the impact of the model's noise level on the convergence rate. In the complementary small-T-large-n regime, we provide a two-phase extension of M-SPP that achieves comparable convergence rates. Moreover, we derive a near-tight high-probability (over the randomness of the data) bound on the parameter estimation error of a sampling-without-replacement variant of M-SPP. Numerical evidence is provided to support our theoretical predictions when instantiated for Lasso and logistic regression models.
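
For intuition, below is a minimal sketch of a minibatch stochastic proximal point loop for a ridge-regularized least-squares objective, where each proximal subproblem has a closed-form solution. The function name, minibatch size, step size, and regularization parameter are illustrative assumptions made here for the sketch, not the schedules or settings analyzed in the article.

```python
import numpy as np

def mspp_least_squares(A, y, n_batch=32, T=200, eta=1.0, lam=1e-3, seed=0):
    """Sketch of a minibatch stochastic proximal point (M-SPP) loop for
    ridge-regularized least squares.  Hyperparameters are placeholders."""
    rng = np.random.default_rng(seed)
    N, d = A.shape
    x = np.zeros(d)
    for t in range(T):
        # Sample a minibatch (with replacement here; the article also studies
        # a sampling-without-replacement variant).
        idx = rng.choice(N, size=n_batch, replace=True)
        A_b, y_b = A[idx], y[idx]
        # Proximal point subproblem:
        #   min_x  1/(2 n) ||A_b x - y_b||^2 + lam/2 ||x||^2 + 1/(2 eta) ||x - x_t||^2
        # For squared loss this reduces to a d-by-d linear system.
        H = A_b.T @ A_b / n_batch + (lam + 1.0 / eta) * np.eye(d)
        g = A_b.T @ y_b / n_batch + x / eta
        x = np.linalg.solve(H, g)
    return x

if __name__ == "__main__":
    # Toy usage on synthetic data.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((1000, 20))
    x_star = rng.standard_normal(20)
    y = A @ x_star + 0.1 * rng.standard_normal(1000)
    x_hat = mspp_least_squares(A, y)
    print("estimation error:", np.linalg.norm(x_hat - x_star))
```

For general convex composite losses the subproblem has no closed form and would be solved approximately by an inner solver; the closed-form update above is only meant to show the structure of the minibatch proximal step.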

