Lower Bounds for Non-Convex Stochastic Optimization

by Yossi Arjevani, et al.

We lower bound the complexity of finding ϵ-stationary points (points with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that, in the worst case, any algorithm requires at least ϵ^-4 queries to find an ϵ-stationary point. The lower bound is tight and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of ϵ^-3 queries, establishing the optimality of recently proposed variance reduction techniques.
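To make the oracle model concrete, here is a minimal Python sketch of plain stochastic gradient descent querying an unbiased, bounded-variance gradient oracle until it reaches an ϵ-stationary point. Nothing in it comes from the paper: the 1-D test function f(x) = x^2 / (1 + x^2), the smoothness constant L = 2, the Gaussian noise, the step-size rule, and the names `true_grad`, `noisy_grad`, and `sgd_until_stationary` are all assumptions chosen for illustration. The stopping test reads the exact gradient, which a real algorithm in this model cannot do; it is used here only to count queries.

```python
import random


def true_grad(x):
    # Exact derivative of the assumed test function f(x) = x^2 / (1 + x^2),
    # which is smooth (|f''| <= 2) and non-convex.
    return 2.0 * x / (1.0 + x * x) ** 2


def noisy_grad(x, sigma, rng):
    # Hypothetical stochastic oracle: unbiased, with variance sigma^2.
    return true_grad(x) + rng.gauss(0.0, sigma)


def sgd_until_stationary(x0, eps, sigma, seed=0, max_queries=200_000):
    """Run SGD and return (x, number of oracle queries used)."""
    rng = random.Random(seed)
    smoothness = 2.0  # assumed Lipschitz constant of the gradient
    # Illustrative step size: small enough that the noise does not swamp
    # progress at accuracy eps. This is an assumption, not a tuned value.
    step = min(1.0 / smoothness, eps ** 2 / (smoothness * sigma ** 2))
    x, queries = x0, 0
    while queries < max_queries:
        # Diagnostic only: the exact gradient is not available to the algorithm.
        if abs(true_grad(x)) <= eps:
            break
        x -= step * noisy_grad(x, sigma, rng)
        queries += 1
    return x, queries


if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        x, queries = sgd_until_stationary(x0=1.0, eps=eps, sigma=1.0)
        print(f"eps={eps}: stopped at x={x:.4f} after {queries} queries")
```

A single easy instance such as this one says nothing about worst-case complexity; the bounds in the abstract concern the hardest function and oracle in the class, so the query counts printed here should not be expected to track ϵ^-4.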




Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

We design an algorithm which finds an ϵ-approximate stationary point (wi...

Lazy Queries Can Reduce Variance in Zeroth-order Optimization

A major challenge of applying zeroth-order (ZO) methods is the high quer...

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator

In this paper, we propose a new technique named Stochastic Path-Integrat...

Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law

We prove a query complexity lower bound for approximating the top r dime...

Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel Optimization

In this paper, we consider non-convex stochastic bilevel optimization (S...

Biased Stochastic Gradient Descent for Conditional Stochastic Optimization

Conditional Stochastic Optimization (CSO) covers a variety of applicatio...

Oracle lower bounds for stochastic gradient sampling algorithms

We consider the problem of sampling from a strongly log-concave density ...