A Single-Timescale Stochastic Bilevel Optimization Method

by Tianyi Chen, et al.

Stochastic bilevel optimization generalizes classic stochastic optimization from the minimization of a single objective to the minimization of an objective function that depends on the solution of another optimization problem. Recently, stochastic bilevel optimization has been regaining popularity in emerging machine learning applications such as hyper-parameter optimization and model-agnostic meta learning. To solve this class of stochastic optimization problems, existing methods require either double-loop or two-timescale updates, which are sometimes less efficient. This paper develops a new optimization method for a class of stochastic bilevel problems, which we term the Single-Timescale stochAstic BiLevEl optimization (STABLE) method. STABLE runs in a single-loop fashion and uses a single-timescale update with a fixed batch size. To achieve an ϵ-stationary point of the bilevel problem, STABLE requires O(ϵ^-2) samples in total; to achieve an ϵ-optimal solution in the strongly convex case, STABLE requires O(ϵ^-1) samples. To the best of our knowledge, this is the first bilevel optimization algorithm to achieve the same order of sample complexity as the stochastic gradient descent method for single-level stochastic optimization.
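
For orientation, the problem class described above is commonly written in the following form. This is a sketch of the standard formulation, not text from the abstract: the symbols x, y, f, g, ξ, φ and the dimensions d and d' are notation assumed here for illustration.

```latex
% Generic stochastic bilevel problem (notation assumed, not from the abstract):
% the upper-level objective F depends on x directly and through y^*(x),
% the solution of the lower-level problem.
\begin{align}
  \min_{x \in \mathbb{R}^{d}} \quad
    & F(x) := \mathbb{E}_{\xi}\big[ f\big(x, y^{*}(x); \xi\big) \big] \\
  \text{s.t.} \quad
    & y^{*}(x) \in \operatorname*{arg\,min}_{y \in \mathbb{R}^{d'}}
      \; \mathbb{E}_{\phi}\big[ g(x, y; \phi) \big]
\end{align}
% One common convention for an \epsilon-stationary point (an assumption
% about which convention is meant here):
\begin{equation}
  \mathbb{E}\big[ \| \nabla F(x) \|^{2} \big] \le \epsilon
\end{equation}
```

Under that convention, an O(ϵ^-2) sample complexity is the same order that stochastic gradient descent attains for a smooth nonconvex single-level problem, which is the comparison the abstract draws. Hyper-parameter optimization fits this template by treating the hyper-parameters as x, the model weights as y, the validation loss as f, and the training loss as g.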



Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization

Stochastic compositional optimization generalizes classic (non-compositi...

A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization

This paper proposes a new algorithm – the Momentum-assisted Single-times...

Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-Start

We analyze a general class of bilevel problems, in which the upper-level...

Efficient Optimization of Loops and Limits with Randomized Telescoping Sums

We consider optimization problems in which the objective requires an inn...

Deep-Learning-Enabled Simulated Annealing for Topology Optimization

Topology optimization by distributing materials in a domain requires sto...

The importance of better models in stochastic optimization

Standard stochastic optimization methods are brittle, sensitive to steps...

Emergent Jaw Predominance in Vocal Development through Stochastic Optimization

Infant vocal babbling strongly relies on jaw oscillations, especially at...