Analyzing Inexact Hypergradients for Bilevel Learning

01/11/2023
by Matthias J. Ehrhardt, et al.

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters cannot be feasibly computed and approximate strategies are required. We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation, showing that these two seemingly disparate approaches are actually tightly connected. Our framework is extremely flexible, allowing its subproblems to be solved with any suitable method, to any degree of accuracy. We derive a priori and computable a posteriori error bounds for all our methods, and numerically show that our a posteriori bounds are usually more accurate. Our numerical results also show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
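
To make the implicit-function-theorem route to a hypergradient concrete, here is a minimal Python sketch on a toy ridge-regression problem. It is an illustration only, not the framework from the paper: the lower-level objective, the upper-level loss, the parametrization lambda = exp(theta), the choice of conjugate gradients as solver, and all function names are assumptions made for this example. The `tol` argument controls how accurately both the lower-level problem and the linear system are solved, so loosening it yields an inexact hypergradient.

```python
import numpy as np

def conjugate_gradient(matvec, rhs, tol=1e-12, max_iter=500):
    """Solve matvec(x) = rhs for a symmetric positive definite operator."""
    x = np.zeros_like(rhs)
    r = rhs.copy()              # residual for the zero initial guess
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:  # stop once the residual norm is small enough
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def hessian_matvec(A, theta):
    """Hessian of the assumed lower-level objective
    g(x, theta) = 0.5*||A x - b||^2 + 0.5*exp(theta)*||x||^2."""
    lam = np.exp(theta)
    return lambda v: A.T @ (A @ v) + lam * v

def lower_level_solution(A, b, theta, tol=1e-12):
    """Approximate minimizer of g(., theta): solves (A^T A + lam I) x = A^T b."""
    return conjugate_gradient(hessian_matvec(A, theta), A.T @ b, tol=tol)

def upper_loss(x, x_true):
    """Assumed upper-level loss f(x) = 0.5*||x - x_true||^2."""
    return 0.5 * np.sum((x - x_true) ** 2)

def ift_hypergradient(A, b, x_true, theta, tol=1e-12):
    """d/dtheta f(x*(theta)) via the implicit function theorem.

    Differentiating the optimality condition H x = A^T b gives
    dx/dtheta = -lam * H^{-1} x, hence the hypergradient is
    -lam * x^T H^{-1} (x - x_true), using symmetry of H.
    """
    lam = np.exp(theta)
    x = lower_level_solution(A, b, theta, tol=tol)
    q = conjugate_gradient(hessian_matvec(A, theta), x - x_true, tol=tol)
    return -lam * (x @ q)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = rng.standard_normal(10)
b = A @ x_true + 0.1 * rng.standard_normal(30)
theta = 0.0

grad = ift_hypergradient(A, b, x_true, theta)

# Central finite difference as an independent check of the formula.
eps = 1e-4
fd = (upper_loss(lower_level_solution(A, b, theta + eps), x_true)
      - upper_loss(lower_level_solution(A, b, theta - eps), x_true)) / (2 * eps)

print("IFT hypergradient:", grad)
print("finite difference:", fd)
```

Calling `ift_hypergradient(A, b, x_true, theta, tol=1e-2)` gives a cheaper, less accurate estimate; comparing it with the tight-tolerance value is one simple way to observe the kind of hypergradient error that the bounds in the abstract are concerned with.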

research
07/05/2023

Implicit Differentiation for Hyperparameter Tuning the Weighted Graphical Lasso

We provide a framework and algorithm for tuning the hyperparameters of t...
research
05/04/2021

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel ...
research
11/28/2022

A posteriori error bounds for the block-Lanczos method for matrix function approximation

We extend the error bounds from [SIMAX, Vol. 43, Iss. 2, pp. 787-811 (20...
research
05/31/2021

Efficient and Modular Implicit Differentiation

Automatic differentiation (autodiff) has revolutionized machine learning...
research
02/20/2020

Implicit differentiation of Lasso-type models for hyperparameter optimization

Setting regularization parameters for Lasso-type estimators is notorious...
research
07/02/2016

Approximate Joint Matrix Triangularization

We consider the problem of approximate joint triangularization of a set ...
research
04/07/2021

Rademacher Complexity and Numerical Quadrature Analysis of Stable Neural Networks with Applications to Numerical PDEs

Methods for solving PDEs using neural networks have recently become a ve...
