Convex and Non-Convex Optimization under Generalized Smoothness

06/02/2023
by Haochuan Li et al.

Classical analysis of convex and non-convex optimization methods often requires the Lipschitzness of the gradient, which limits the analysis to functions bounded by quadratics. Recent work relaxed this requirement to a non-uniform smoothness condition with the Hessian norm bounded by an affine function of the gradient norm, and proved convergence in the non-convex setting via gradient clipping, assuming bounded noise. In this paper, we further generalize this non-uniform smoothness condition and develop a simple yet powerful analysis technique that bounds the gradients along the trajectory, thereby leading to stronger results for both convex and non-convex optimization problems. In particular, we obtain the classical convergence rates for (stochastic) gradient descent and Nesterov's accelerated gradient method in the convex and/or non-convex setting under this general smoothness condition. The new analysis approach does not require gradient clipping and allows heavy-tailed noise with bounded variance in the stochastic setting.
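For reference, the non-uniform smoothness condition described above (often called (L0, L1)-smoothness in the literature) can be stated as

    ||∇²f(x)|| ≤ L0 + L1 ||∇f(x)||   for all x,

which recovers classical L-smoothness as the special case L1 = 0, while also covering functions whose Hessians grow with the gradient, such as f(x) = x⁴ or f(x) = exp(x), for which no global quadratic bound holds.

The gradient clipping referred to above is, in its simplest form, norm clipping of the update direction. Below is a minimal sketch in Python for illustration (hypothetical function and parameter names, not the paper's algorithm):

    import numpy as np

    def clipped_gd(grad_fn, x0, eta=0.1, gamma=1.0, steps=1000):
        # Gradient descent with norm clipping: the step direction is the
        # gradient, but its norm is capped at gamma, which keeps steps
        # stable in regions where the gradient (and hence the local
        # smoothness constant) is large.
        x = np.asarray(x0, dtype=float)
        for _ in range(steps):
            g = np.asarray(grad_fn(x), dtype=float)
            norm = np.linalg.norm(g)
            if norm > gamma:
                g = g * (gamma / norm)  # clip to norm at most gamma
            x = x - eta * g
        return x

Per the abstract, the paper's analysis bounds the gradients along the trajectory directly, so plain (stochastic) gradient descent and Nesterov's method achieve the classical rates under the generalized condition without this clipping step.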

