Truncated Back-propagation for Bilevel Optimization

10/25/2018
by Amirreza Shaban, et al.

Bilevel optimization has recently been revisited for designing and analyzing algorithms in hyperparameter tuning and meta-learning tasks. However, due to its nested structure, evaluating exact gradients for high-dimensional problems is computationally challenging. One heuristic for circumventing this difficulty is to use the approximate gradient given by truncated back-propagation through the iterative optimization procedure that solves the lower-level problem. Although promising empirical performance has been reported, the theoretical properties of this approximation remain unclear. In this paper, we analyze this family of approximate gradients and establish sufficient conditions for convergence. We validate our analysis on several hyperparameter tuning and meta-learning tasks, and find that optimization with the approximate gradient computed using few-step back-propagation often performs comparably to optimization with the exact gradient, while requiring far less memory and half the computation time.
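To make the idea concrete, below is a minimal JAX sketch of a K-step truncated hypergradient for a toy ridge-regression lower-level problem. The function names, the choice of plain gradient descent as the inner optimizer, and all constants are illustrative assumptions rather than the paper's implementation.

```python
import jax
import jax.numpy as jnp

# Illustrative lower-level problem: ridge regression, where the
# hyperparameter lam is the L2 regularization weight.
def inner_loss(w, lam, x, y):
    return jnp.mean((x @ w - y) ** 2) + lam * jnp.sum(w ** 2)

# Upper-level objective: validation loss of the inner solution.
def outer_loss(w, x_val, y_val):
    return jnp.mean((x_val @ w - y_val) ** 2)

def truncated_hypergrad(lam, w0, train, val, T=100, K=5, lr=0.1):
    """Approximate d outer_loss / d lam by back-propagating through
    only the last K of T inner gradient steps."""
    x, y = train
    # Phase 1: the first T - K inner steps run outside the
    # differentiation trace; their dependence on lam is discarded.
    # This discarding is exactly the truncation.
    w = w0
    for _ in range(T - K):
        w = w - lr * jax.grad(inner_loss)(w, lam, x, y)

    # Phase 2: differentiate the outer loss through the last K steps only.
    def last_k_steps(lam, w):
        for _ in range(K):
            w = w - lr * jax.grad(inner_loss)(w, lam, x, y)
        return outer_loss(w, *val)

    return jax.grad(last_k_steps)(lam, w)

# Toy usage: the approximate gradient can drive any outer optimizer,
# e.g. gradient descent on lam. Data here is synthetic.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
w_true = jnp.array([1.0, -2.0, 0.5, 3.0])
y = x @ w_true
g = truncated_hypergrad(jnp.array(0.1), jnp.zeros(4), (x, y), (x, y), T=50, K=5)
```

Because reverse-mode memory grows with the number of unrolled steps, back-propagating through only the last K of T inner steps is what yields the memory savings noted in the abstract.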


Related research

Far-HO: A Bilevel Programming Package for Hyperparameter Optimization and Meta-Learning (06/13/2018)
In (Franceschi et al., 2018) we proposed a unified mathematical framewor...

Bilevel Programming for Hyperparameter Optimization and Meta-Learning (06/13/2018)
We introduce a framework based on bilevel programming that unifies gradi...

Convergence Properties of Stochastic Hypergradients (11/13/2020)
Bilevel optimization problems are receiving increasing attention in mach...

Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties (10/03/2020)
Many popular adaptive gradient methods such as Adam and RMSProp rely on ...

UFO-BLO: Unbiased First-Order Bilevel Optimization (06/05/2020)
Bilevel optimization (BLO) is a popular approach with many applications ...

Meta-Learning with Adjoint Methods (10/16/2021)
Model Agnostic Meta-Learning (MAML) is widely used to find a good initia...

Multi-objective Tree-structured Parzen Estimator Meets Meta-learning (12/13/2022)
Hyperparameter optimization (HPO) is essential for the better performanc...
