Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems

06/25/2021
by   Tianyi Chen, et al.

Stochastic nested optimization, including stochastic compositional, min-max, and bilevel optimization, is gaining popularity in many machine learning applications. While the three problems share a nested structure, existing works often treat them separately and develop problem-specific algorithms and analyses. Among various developments, simple SGD-type updates (potentially on multiple variables) remain prevalent for solving this class of nested problems, but they are believed to converge more slowly than SGD on non-nested problems. This paper unifies several SGD-type updates for stochastic nested problems into a single approach that we term the ALternating Stochastic gradient dEscenT (ALSET) method. By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems. Under the new analysis, ALSET requires O(ϵ^-2) samples to achieve an ϵ-stationary point of the nested problem. Under certain regularity conditions, applying our results to stochastic compositional, min-max, and reinforcement learning problems either improves or matches the best-known sample complexity in the respective cases. Our results explain why simple SGD-type algorithms for stochastic nested problems work well in practice without further modification.
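
To make the alternating-update idea concrete, below is a minimal sketch of an ALSET-style update on a toy stochastic compositional problem min_x f(g(x)): an inner variable tracks the inner map g(x) from noisy samples, and the outer variable takes a chain-rule SGD step using that tracked estimate. The toy problem, the matrices A and b, and the step sizes are illustrative assumptions, not the paper's exact algorithm or constants.

```python
import numpy as np

# Toy stochastic compositional problem min_x f(g(x)) with
#   g(x) = E[A x + noise]                 (inner map)
#   f(y) = E[0.5 * ||y - (b + noise)||^2] (outer loss).
rng = np.random.default_rng(0)
d, m = 5, 3
A = rng.standard_normal((m, d))   # inner linear map (assumed, for illustration)
b = rng.standard_normal(m)        # outer target (assumed, for illustration)
sigma = 0.1                       # sampling noise level

x = np.zeros(d)                   # outer (decision) variable
y = np.zeros(m)                   # running estimate of g(x)
alpha, beta = 0.05, 0.5           # outer / inner step sizes (illustrative)

for k in range(2000):
    # Inner step: track g(x) from a noisy sample via a moving average.
    g_sample = A @ x + sigma * rng.standard_normal(m)
    y = (1.0 - beta) * y + beta * g_sample

    # Outer step: stochastic chain-rule gradient of f(g(x)), using the
    # tracked estimate y in place of the true inner value g(x).
    grad_f = y - (b + sigma * rng.standard_normal(m))
    x -= alpha * (A.T @ grad_f)   # sampled Jacobian of g is just A here

# After convergence, A @ x approximates b in the least-squares sense.
print("composite loss:", 0.5 * np.linalg.norm(A @ x - b) ** 2)
```

The design point this sketch illustrates is that both variables are updated with plain SGD-type steps in alternation, with no double loop or variance-reduction machinery, which is the regime the paper's tighter analysis covers.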

Related research:

08/25/2020 · Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization
Stochastic compositional optimization generalizes classic (non-compositi...

11/14/2022 · Alternating Implicit Projected SGD and Its Efficient Variants for Equality-constrained Bilevel Optimization
Stochastic bilevel optimization, which captures the inherent nested stru...

07/19/2022 · Riemannian Stochastic Gradient Method for Nested Composition Optimization
This work considers optimization of composition of functions in a nested...

02/12/2021 · Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness
Stochastic gradient algorithms are often unstable when applied to functi...

05/04/2022 · FEDNEST: Federated Bilevel, Minimax, and Compositional Optimization
Standard federated optimization methods successfully apply to stochastic...

10/23/2022 · Decentralized Stochastic Bilevel Optimization with Improved Per-Iteration Complexity
Bilevel optimization recently has received tremendous attention due to i...

03/09/2015 · Learning Co-Sparse Analysis Operators with Separable Structures
In the co-sparse analysis model a set of filters is applied to a signal ...
