A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization
This paper proposes a new algorithm, the Momentum-assisted Single-timescale Stochastic Approximation (MSTSA) algorithm, for tackling unconstrained bilevel optimization problems. We focus on bilevel problems where the lower-level subproblem is strongly convex. Unlike prior works that rely on two-timescale or double-loop techniques to track the optimal solution of the lower-level subproblem, we design a stochastic momentum-assisted gradient estimator for the upper-level updates. This estimator allows us to gradually control the error in the stochastic gradient updates caused by inexact solutions of the lower-level subproblem. We show that if the upper-level objective is smooth but possibly non-convex (resp. strongly convex), MSTSA requires 𝒪(ϵ^-2) (resp. 𝒪(ϵ^-1)) iterations, each using a constant number of samples, to find an ϵ-stationary (resp. ϵ-optimal) solution. This matches the best-known guarantees for stochastic bilevel problems. We validate our theoretical results by demonstrating the efficiency of MSTSA on hyperparameter optimization and data hyper-cleaning problems.
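To give a sense of the momentum-assisted gradient estimator the abstract refers to, the following is a minimal sketch of a STORM-style recursive estimator applied to a simple single-level toy objective. It is not the paper's MSTSA algorithm (which applies such an estimator to the upper-level bilevel update with an inexact lower-level solution); the objective, step sizes, and names below are assumptions for illustration only.

```python
# Sketch of a momentum-assisted (STORM-style) stochastic gradient estimator.
# Toy single-level example; all constants and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 5.0])              # toy strongly convex quadratic objective
x_opt = np.array([1.0, -2.0])

def stoch_grad(x, noise):
    """Noisy gradient of f(x) = 0.5*(x - x_opt)^T A (x - x_opt)."""
    return A @ (x - x_opt) + 0.1 * noise

alpha, eta = 0.05, 0.5               # step size and momentum parameter (assumed)
x = np.zeros(2)
d = stoch_grad(x, rng.standard_normal(2))   # initial gradient estimate

for t in range(500):
    x_prev, x = x, x - alpha * d
    # Momentum-assisted recursion: correct the previous estimate with the
    # gradient difference at consecutive iterates, evaluated on one fresh
    # sample, which gradually damps the accumulated stochastic error.
    xi = rng.standard_normal(2)
    d = stoch_grad(x, xi) + (1.0 - eta) * (d - stoch_grad(x_prev, xi))

print("final iterate:", x, " target:", x_opt)
```

In MSTSA this type of recursion is used on the upper-level (implicit) gradient, so the error introduced by only approximately solving the strongly convex lower-level subproblem can be controlled without a double loop or a second timescale.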