Transformer-Based Learned Optimization

12/02/2022
by Erik Gärtner, et al.

In this paper, we propose a new approach to learned optimization. As is common in the literature, we represent the computation of the optimizer's update step with a neural network. The parameters of the optimizer are then learned on a set of training optimization tasks so that it performs minimization efficiently. Our main innovation is a new neural network architecture for the learned optimizer, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but use a transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization approaches, our formulation allows for conditioning across different dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark of objective functions traditionally used to evaluate optimization algorithms, as well as on the real-world task of physics-based reconstruction of articulated 3D human motion.
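The core mechanism, a preconditioning matrix built as a sum of rank-one updates and applied to the gradient, can be sketched in a few lines of NumPy. This is a minimal illustration of the update structure only: the function name `preconditioned_step`, the fixed "predicted" vector, and the toy quadratic objective are placeholders of our own, standing in for what the paper's transformer would actually predict per step.

```python
import numpy as np

def preconditioned_step(grad, update_vectors, step_length):
    """Compute an update step -alpha * P @ grad, where the
    preconditioner P is the identity plus a sum of rank-one
    updates v v^T (BFGS-style). In the paper, the vectors and
    step length would be predicted by a transformer; here they
    are supplied directly as placeholders."""
    d = grad.shape[0]
    P = np.eye(d)
    for v in update_vectors:
        P += np.outer(v, v)  # each rank-one term keeps P symmetric PSD
    return -step_length * P @ grad

# Toy ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimum at 0.
A = np.diag([1.0, 10.0])
x = np.array([1.0, 1.0])
for _ in range(50):
    g = A @ x
    # Stand-in "prediction": a rank-one update that accelerates
    # the poorly scaled first coordinate.
    x = x + preconditioned_step(g, [np.array([1.0, 0.0])], 0.05)
```

With the rank-one term along the first axis, the effective step size on the slow direction doubles, so the iterates approach the minimum faster than plain gradient descent would; the learned optimizer's job is to produce such vectors adaptively from the optimization history.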


