Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity
We generalize the classic convergence rate theory for subgradient methods to non-Lipschitz functions via a new measure of steepness. For the deterministic projected subgradient method, we derive a global O(1/√T) convergence rate for any function with at most exponential growth. Our approach implies generalizations of the standard convergence rates for gradient descent on functions with Lipschitz or Hölder continuous gradients. Further, we show an O(1/√T) convergence rate for the stochastic projected subgradient method on functions with at most quadratic growth, which improves to O(1/T) under strong convexity.
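To make the setting concrete, the following is a minimal sketch of a deterministic projected subgradient method applied to a convex function that is not globally Lipschitz. Everything in it is an illustrative assumption and is not taken from the paper: the objective f(x) = 0.5‖x − a‖² + ‖x‖₁, the feasible set (the nonnegative orthant), the normalized step size 1/(‖g‖√(k+1)), and all function names.

```python
import math


def objective(x, a):
    # f(x) = 0.5*||x - a||^2 + ||x||_1: convex, with quadratic growth,
    # so it is not Lipschitz on an unbounded set (illustrative choice).
    return sum(0.5 * (xi - ai) ** 2 + abs(xi) for xi, ai in zip(x, a))


def subgradient(x, a):
    # One valid subgradient: (x - a) + sign(x), taking sign(0) = 0.
    def sign(v):
        return (v > 0) - (v < 0)

    return [(xi - ai) + sign(xi) for xi, ai in zip(x, a)]


def project_nonneg(x):
    # Euclidean projection onto the nonnegative orthant (assumed feasible set).
    return [max(xi, 0.0) for xi in x]


def projected_subgradient(a, x0, num_iters):
    # Returns the best iterate seen and its objective value.
    x = project_nonneg(x0)
    best_x, best_f = x, objective(x, a)
    for k in range(num_iters):
        g = subgradient(x, a)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm == 0.0:
            break  # a zero subgradient certifies optimality
        # Normalized, diminishing step: each move has length 1/sqrt(k+1),
        # so no Lipschitz constant is needed to set it (assumed rule).
        step = 1.0 / (g_norm * math.sqrt(k + 1))
        x = project_nonneg([xi - step * gi for xi, gi in zip(x, g)])
        f = objective(x, a)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


if __name__ == "__main__":
    a = [3.0, -2.0, 0.5]
    best_x, best_f = projected_subgradient(a, [0.0, 0.0, 0.0], 2000)
    print(best_x, best_f)
```

For this separable example the constrained minimizer is available in closed form, x_i = max(a_i − 1, 0), giving x* = (2, 0, 0) and f(x*) = 4.625 for a = (3, −2, 0.5), so the best objective value found can be compared against it. The method is not a descent method, which is why the sketch tracks the best iterate rather than returning the last one.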