Second-Order Sensitivity Analysis for Bilevel Optimization

05/04/2022
by Robert Dyro, et al.

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper" problem) as a function of the parameters. Many existing approaches to bilevel optimization employ first-order sensitivity analysis, based on the implicit function theorem (IFT), for the lower problem to derive a gradient of the lower problem solution with respect to its parameters; this IFT gradient is then used in a first-order optimization method for the upper problem. This paper extends this sensitivity analysis to provide second-order derivative information of the lower problem (which we call the IFT Hessian), enabling the use of faster-converging second-order optimization methods at the upper level. Our analysis shows that (i) much of the computation already used to produce the IFT gradient can be reused for the IFT Hessian, (ii) error bounds derived for the IFT gradient readily apply to the IFT Hessian, and (iii) computing IFT Hessians can significantly reduce overall computation by extracting more information from each lower-level solve. We corroborate our findings and demonstrate the broad range of applications of our method by applying it to problem instances of least squares hyperparameter auto-tuning, multi-class SVM auto-tuning, and inverse optimal control.
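To make the idea concrete, here is a minimal scalar sketch of an IFT gradient and IFT Hessian. It is an illustration under stated assumptions, not the paper's implementation. The toy lower problem min_z 0.25 z^4 + 0.5 z^2 - p z, the upper objective 0.5 (z*(p) - target)^2, and all function names are hypothetical choices made for this example. The stationarity condition is k(z, p) = z^3 + z - p = 0. Differentiating it once gives dz/dp = -k_p / k_z, and differentiating again gives d2z/dp2 = -(k_zz (dz/dp)^2 + 2 k_zp dz/dp + k_pp) / k_z. Both expressions divide by the same k_z, which is the scalar counterpart of reusing the gradient computation for the Hessian.

```python
def solve_lower(p, tol=1e-14, max_iter=100):
    """Solve the toy lower problem by Newton's method on k(z, p) = z**3 + z - p = 0."""
    z = 0.0
    for _ in range(max_iter):
        step = (z**3 + z - p) / (3.0 * z**2 + 1.0)
        z -= step
        if abs(step) < tol:
            break
    return z


def ift_derivatives(z):
    """Return (dz/dp, d2z/dp2) at a lower-level solution z via the implicit function theorem."""
    k_z = 3.0 * z**2 + 1.0   # dk/dz, shared by both derivatives
    k_p = -1.0               # dk/dp
    k_zz = 6.0 * z           # d2k/dz2; k_zp and k_pp are zero for this toy k
    dz = -k_p / k_z                  # IFT gradient
    d2z = -(k_zz * dz**2) / k_z      # IFT Hessian (terms with k_zp, k_pp vanish)
    return dz, d2z


def upper_newton(target, p0=0.0, iters=20):
    """Newton's method on the upper objective F(p) = 0.5 * (z*(p) - target)**2."""
    p = p0
    for _ in range(iters):
        z = solve_lower(p)               # one lower-level solve per upper step
        dz, d2z = ift_derivatives(z)
        grad = (z - target) * dz                 # F'(p) by the chain rule
        hess = dz**2 + (z - target) * d2z        # F''(p), uses the IFT Hessian
        if hess <= 0.0:
            hess = dz**2                 # assumed safeguard: Gauss-Newton fallback
        p -= grad / hess
    return p
```

For example, `upper_newton(2.0)` searches for the parameter p at which the lower solution equals 2; since k(2, p) = 0 requires p = 2^3 + 2, the iterates should approach p = 10. Each upper step performs a single lower-level solve and extracts both first- and second-order information from it.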

