On Momentum-Based Gradient Methods for Bilevel Optimization with Nonconvex Lower-Level

03/07/2023
by Feihu Huang, et al.

Bilevel optimization is a popular two-level hierarchical optimization framework that has been widely applied to machine learning tasks such as hyperparameter learning, meta learning, and continual learning. Although many bilevel optimization methods have recently been developed, they are not well studied when the lower-level problem is nonconvex. To fill this gap, we study a class of nonconvex bilevel optimization problems in which both the upper-level and lower-level problems are nonconvex and the lower-level problem satisfies the Polyak-Łojasiewicz (PL) condition. We propose an efficient momentum-based gradient bilevel method (MGBiO) to solve the deterministic version of these problems, and a class of efficient momentum-based stochastic gradient bilevel methods (MSGBiO and VR-MSGBiO) to solve the stochastic version. Moreover, we provide a useful convergence analysis framework for our methods. Specifically, under some mild conditions, we prove that our MGBiO method has a sample (or gradient) complexity of O(ϵ^-2) for finding an ϵ-stationary solution of the deterministic bilevel problems (i.e., ‖∇F(x)‖ ≤ ϵ), which improves the existing best results by a factor of O(ϵ^-1). Meanwhile, we prove that our MSGBiO and VR-MSGBiO methods have sample complexities of Õ(ϵ^-4) and Õ(ϵ^-3), respectively, in finding an ϵ-stationary solution of the stochastic bilevel problems (i.e., 𝔼‖∇F(x)‖ ≤ ϵ), which improves the existing best results by a factor of O(ϵ^-3). This manuscript commemorates the mathematician Boris Polyak (1935-2023).
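
For orientation, the problem class described in the abstract can be written in the standard bilevel form sketched below. The symbols f, g, y*(x), μ, ξ, and ζ are notation assumed here for illustration; the abstract itself names only F and ϵ, and the full text may state the conditions with different constants or additional assumptions.

```latex
% Bilevel problem (assumed standard form): upper-level objective f,
% lower-level objective g, both possibly nonconvex.
\min_{x \in \mathbb{R}^d} \; F(x) := f\bigl(x, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) \in \arg\min_{y \in \mathbb{R}^p} g(x, y).

% Stochastic variant: both levels are expectations over random samples.
f(x, y) = \mathbb{E}_{\xi}\bigl[f(x, y; \xi)\bigr],
\qquad
g(x, y) = \mathbb{E}_{\zeta}\bigl[g(x, y; \zeta)\bigr].

% Polyak-Lojasiewicz (PL) condition on the lower level, with some \mu > 0:
% the gradient norm controls the optimality gap even without convexity in y.
\bigl\| \nabla_y g(x, y) \bigr\|^2
\;\ge\; 2\mu \Bigl( g(x, y) - \min_{y'} g(x, y') \Bigr)
\quad \text{for all } x, y.

% \epsilon-stationarity targets quoted in the abstract.
\|\nabla F(x)\| \le \epsilon \;\; \text{(deterministic)},
\qquad
\mathbb{E}\,\|\nabla F(x)\| \le \epsilon \;\; \text{(stochastic)}.
```

Under the PL condition the lower-level minimizer need not be unique, which is why y*(x) is written as a member of the argmin set rather than as a single point.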


