Subgradient methods near active manifolds: saddle point avoidance, local convergence, and asymptotic normality

08/26/2021
by Damek Davis, et al.

Nonsmooth optimization problems arising in practice tend to exhibit beneficial smooth substructure: their domains stratify into "active manifolds" of smooth variation, which common proximal algorithms "identify" in finite time. Identification then entails a transition to smooth dynamics and accommodates second-order acceleration techniques. While identification is clearly useful algorithmically, empirical evidence suggests that even algorithms that do not identify the active manifold in finite time, notably the subgradient method, are nonetheless affected by it. This work seeks to explain that phenomenon by asking: how do active manifolds impact the subgradient method in nonsmooth optimization? We answer this question by introducing two algorithmically useful properties, aiming and subgradient approximation, that fully expose the smooth substructure of the problem. We show that these properties imply that the shadow of the (stochastic) subgradient method along the active manifold is precisely an inexact Riemannian gradient method with an implicit retraction. We prove that these properties hold for a wide class of problems, including cone reducible/decomposable functions and generic semialgebraic problems. Moreover, we develop a thorough calculus, proving that such properties are preserved under smooth deformations and spectral lifts. This viewpoint leads to several algorithmic consequences that parallel results in smooth optimization, despite the nonsmoothness of the problem: local rates of convergence, asymptotic normality, and saddle point avoidance. The asymptotic normality results appear to be new even in the most classical setting of stochastic nonlinear programming. The results culminate in the following observation: on generic, Clarke regular semialgebraic problems, the perturbed subgradient method converges only to local minimizers.
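To make the central object concrete, here is a small, hypothetical Python sketch (not code from the paper) of the stochastic subgradient method on f(x, y) = |x| + y^2, whose active manifold is the line {x = 0}. Projecting the iterates onto that line yields the "shadow" sequence discussed above; in this toy example it evolves, up to noise, as a gradient method on the smooth restriction y^2. The step-size schedule and noise level are assumptions chosen only for illustration.

```python
# Hypothetical illustration (not taken from the paper): stochastic subgradient
# method on the nonsmooth function f(x, y) = |x| + y**2, whose active manifold
# is M = {x = 0}.  The projected "shadow" iterate (0, y_k) behaves, up to noise,
# like a gradient method on the smooth restriction f|_M(y) = y**2.
import numpy as np

rng = np.random.default_rng(0)

def subgradient(x, y):
    """A Clarke subgradient of f(x, y) = |x| + y**2."""
    gx = np.sign(x) if x != 0.0 else 0.0  # any element of [-1, 1] is valid at x = 0
    gy = 2.0 * y
    return np.array([gx, gy])

z = np.array([1.0, 1.0])            # initial point off the active manifold
for k in range(1, 5001):
    alpha = 0.5 / k**0.75           # diminishing step sizes (assumed schedule)
    noise = 0.1 * rng.standard_normal(2)
    z = z - alpha * (subgradient(*z) + noise)

shadow = np.array([0.0, z[1]])      # projection of the iterate onto M = {x = 0}
print("iterate:", z)
print("shadow on M:", shadow)       # y-coordinate tracks gradient descent on y**2
```

In this simplified setting the projection onto M plays the role of the implicit retraction, and the noise plus the off-manifold component of the subgradient play the role of the inexactness in the Riemannian gradient step.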

Related research

Escaping from saddle points on Riemannian manifolds (06/18/2019)
We consider minimizing a nonconvex, smooth function f on a Riemannian ma...

Nonconvex stochastic optimization on manifolds via Riemannian Frank-Wolfe methods (10/09/2019)
We study stochastic projection-free methods for constrained optimization...

Spectral Discovery of Jointly Smooth Features for Multimodal Data (04/09/2020)
In this paper, we propose a spectral method for deriving functions that ...

Asymptotic normality and optimality in nonsmooth stochastic approximation (01/16/2023)
In their seminal work, Polyak and Juditsky showed that stochastic approx...

Stochastic Subgradient Descent on a Generic Definable Function Converges to a Minimizer (09/06/2021)
It was previously shown by Davis and Drusvyatskiy that every Clarke crit...

Newton retraction as approximate geodesics on submanifolds (06/26/2020)
Efficient approximation of geodesics is crucial for practical algorithms...

Stochastic Subgradient Descent Escapes Active Strict Saddles (08/04/2021)
In non-smooth stochastic optimization, we establish the non-convergence ...
