Path Length Bounds for Gradient Descent and Flow

08/02/2019
by   Chirag Gupta, et al.

We provide path length bounds on gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. We make six distinct contributions: (a) we prove a meta-theorem that if GD converges linearly towards an optimal set, then its path length is upper bounded by the distance to the optimal set multiplied by a function of the rate of convergence; (b) under the Polyak-Łojasiewicz (PL) condition (a generalization of strong convexity that allows for certain nonconvex functions), we show that this multiplicative factor is at most √κ; (c) for PL functions, we show a lower bound on the worst-case path length of Ω(√d ∧ κ^(1/4)) times the length of the direct path; (d) for the special case of quadratics, we show that the bound is Θ(min{√d, √κ}) and in some cases can be independent of κ; (e) under the weaker assumption of just convexity, where there is no natural notion of a condition number, we prove that the path length can be at most 2^(10d^2) times the length of the direct path; (f) finally, for separable quasiconvex functions, the path length is both upper and lower bounded by Θ(√d) times the length of the direct path.
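The quantity being bounded can be made concrete with a small numerical sketch. The snippet below is not taken from the paper: it assumes a diagonal quadratic f(x) = 0.5 xᵀAx, a step size of 1/L, and a hypothetical helper name `gd_path_length`. It sums the lengths of the GD steps and compares the total with the length of the direct path from the starting point to the minimizer at the origin.

```python
import numpy as np

def gd_path_length(A, x0, steps=5000):
    """Run GD on f(x) = 0.5 * x^T A x with step size 1/L and
    return (sum of step lengths, final iterate).
    Hypothetical helper for illustration only."""
    L = np.linalg.eigvalsh(A).max()   # smoothness constant = largest eigenvalue
    x = x0.astype(float)
    length = 0.0
    for _ in range(steps):
        x_next = x - (A @ x) / L      # gradient of f at x is A x
        length += np.linalg.norm(x_next - x)
        x = x_next
    return length, x

# Assumed test problem: d = 4, eigenvalues chosen so that kappa = 100.
d = 4
eigs = np.array([1.0, 5.0, 25.0, 100.0])
A = np.diag(eigs)
x0 = np.ones(d)

length, x_final = gd_path_length(A, x0)
direct = np.linalg.norm(x0)           # minimizer is the origin
ratio = length / direct
kappa = eigs.max() / eigs.min()

print(f"path length / direct distance = {ratio:.4f}")
print(f"sqrt(d) = {np.sqrt(d):.4f}, sqrt(kappa) = {np.sqrt(kappa):.4f}")
```

In this diagonal setting each coordinate shrinks monotonically towards zero, so the path length is at most the ℓ1 norm of the starting point, which is at most √d times the direct distance. The ratio printed above therefore lies between 1 and √d regardless of κ, which is one simple way the dependence on κ can disappear for quadratics.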

research
06/06/2020

Unconstrained Online Optimization: Dynamic Regret Analysis of Strongly Convex and Smooth Problems

The regret bound of dynamic online learning algorithms is often expresse...
research
06/27/2019

Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond

In this paper, we provide near-optimal accelerated first-order methods f...
research
10/10/2018

Tight Dimension Independent Lower Bound on Optimal Expected Convergence Rate for Diminishing Step Sizes in SGD

We study convergence of Stochastic Gradient Descent (SGD) for strongly c...
research
05/11/2020

On Radial Isotropic Position: Theory and Algorithms

We review the theory of, and develop algorithms for transforming a finit...
research
02/22/2022

Connecting Optimization and Generalization via Gradient Flow Path Length

Optimization and generalization are two essential aspects of machine lea...
research
03/31/2012

Covering Numbers for Convex Functions

In this paper we study the covering numbers of the space of convex and u...
research
11/19/2019

Optimal Complexity and Certification of Bregman First-Order Methods

We provide a lower bound showing that the O(1/k) convergence rate of the...
