The Complexity of Making the Gradient Small in Stochastic Convex Optimization

02/13/2019
by Dylan Foster, et al.

We give nearly matching upper and lower bounds on the oracle complexity of finding ϵ-stationary points (‖∇F(x)‖ ≤ ϵ) in stochastic convex optimization. We jointly analyze the oracle complexity in both the local stochastic oracle model and the global oracle (or, statistical learning) model. This allows us to decompose the complexity of finding near-stationary points into optimization complexity and sample complexity, and reveals some surprising differences between the complexity of stochastic optimization and that of learning. Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model. In other words, the separation in complexity between the two models can be exponential, and the folklore understanding that smoothness is required to find stationary points is only weakly true for statistical learning. Our upper bounds are based on extensions of a recent "recursive regularization" technique proposed by Allen-Zhu (2018). We show how to extend the technique to achieve near-optimal rates, and in particular how to leverage the extra information available in the global oracle model. Our algorithm for the global model can be implemented efficiently through finite-sum methods, and suggests an interesting new computational-statistical tradeoff.
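
As a rough illustration of the recursive regularization idea mentioned above (solving a short sequence of increasingly regularized subproblems, each warm-started from the previous solution), the sketch below may help. It is not the paper's algorithm: the generic stochastic gradient oracle `stoch_grad`, the number of rounds, the inner SGD step count, the learning rate, and the doubling schedule for the regularization weight `lam` are all placeholder assumptions for illustration.

```python
import numpy as np


def sgd(grad_est, x0, steps, lr):
    """Plain SGD driven by a stochastic gradient estimator `grad_est`."""
    x = x0.copy()
    for _ in range(steps):
        x = x - lr * grad_est(x)
    return x


def recursive_regularization(stoch_grad, x0, rounds=10, lam0=1.0,
                             inner_steps=1000, lr=1e-3):
    """Illustrative recursive-regularization loop (sketch, not the paper's method).

    In round t we run SGD on the regularized objective
        F(x) + sum_{s <= t} (lam_s / 2) * ||x - x_s||^2,
    warm-starting at the previous solution and doubling the regularization
    weight each round, so later subproblems are increasingly strongly convex
    and their approximate minimizers have increasingly small gradients of F.
    """
    centers, lams = [], []
    x = np.asarray(x0, dtype=float).copy()
    lam = lam0
    for _ in range(rounds):
        centers.append(x.copy())
        lams.append(lam)

        def reg_grad(z, centers=tuple(centers), lams=tuple(lams)):
            # Stochastic gradient of F plus the gradients of all
            # accumulated proximal (regularization) terms.
            g = stoch_grad(z)
            for c, l in zip(centers, lams):
                g = g + l * (z - c)
            return g

        x = sgd(reg_grad, x, inner_steps, lr)
        lam *= 2.0  # later subproblems become more strongly convex
    return x
```

For instance, for a least-squares objective F(x) = E[(a·x − b)²]/2, each call to `stoch_grad(x)` would draw a fresh sample (a, b) and return (a·x − b)·a; the point returned after the final round is the candidate near-stationary point.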

