Optimizer Amalgamation

03/12/2022
by Tianshu Huang, et al.

Selecting an appropriate optimizer for a given problem is of major interest to researchers and practitioners. Many analytical optimizers have been proposed using a variety of theoretical and empirical approaches; however, none offers a universal advantage over the other competitive optimizers. We are thus motivated to study a new problem named Optimizer Amalgamation: how can we best combine a pool of "teacher" optimizers into a single "student" optimizer with stronger problem-specific performance? In this paper, we draw inspiration from the field of "learning to optimize" and use a learnable amalgamation target. First, we define three differentiable amalgamation mechanisms that amalgamate a pool of analytical optimizers by gradient descent. Then, to reduce the variance of the amalgamation process, we explore methods to stabilize it by perturbing the amalgamation target. Finally, we present experiments showing the superiority of our amalgamated optimizer over its amalgamated components and learning-to-optimize baselines, as well as the efficacy of our variance-reducing perturbations. Our code and pre-trained models are publicly available at http://github.com/VITA-Group/OptimizerAmalgamation.
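The abstract describes amalgamating analytical "teacher" optimizers into a learnable "student" update rule by gradient descent. The sketch below is a minimal illustration of that idea, not the paper's exact mechanisms: it unrolls a small learnable update rule on a random quadratic and adds a distillation term that pulls the student's update toward the mean of two analytical teacher updates. The names (StudentOptimizer, teacher_updates, amalgamation_step), the choice of teachers, and the mean-teacher distillation loss are assumptions made for illustration.

```python
# Illustrative sketch only: amalgamating analytical "teacher" updates into a
# learnable "student" update rule via a differentiable, distillation-style loss.
# All names and the specific loss are assumptions, not the paper's mechanisms.
import torch
import torch.nn as nn


class StudentOptimizer(nn.Module):
    """Tiny learnable update rule: maps per-parameter gradient features to updates."""

    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, grad):
        # Element-wise features: the gradient and its sign; output is a bounded update.
        feats = torch.stack([grad, grad.sign()], dim=-1)
        return 0.1 * torch.tanh(self.net(feats)).squeeze(-1)


def teacher_updates(grad, lr=0.1):
    """Analytical 'teacher' updates for the same gradient: plain SGD and a sign-based step."""
    return [-lr * grad, -lr * grad.sign()]


def amalgamation_step(student, meta_opt, steps=20, dist_weight=1.0):
    """Unroll the student on a random convex quadratic and distill toward the mean teacher update."""
    A = torch.randn(8, 8)
    A = A @ A.t() + 0.1 * torch.eye(8)   # random positive-definite quadratic
    b = torch.randn(8)
    w = torch.randn(8, requires_grad=True)

    total_loss = 0.0
    for _ in range(steps):
        loss = 0.5 * w @ A @ w - b @ w
        grad, = torch.autograd.grad(loss, w, create_graph=True)
        update = student(grad)
        # Distillation: pull the student update toward the averaged teacher update.
        target = torch.stack(teacher_updates(grad.detach())).mean(0)
        dist = ((update - target) ** 2).mean()
        w = w + update                     # differentiable unroll through the update
        total_loss = total_loss + loss + dist_weight * dist

    meta_opt.zero_grad()
    total_loss.backward()                  # backprop through the unrolled trajectory
    meta_opt.step()
    return total_loss.item()


if __name__ == "__main__":
    student = StudentOptimizer()
    meta_opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for it in range(100):
        unrolled = amalgamation_step(student, meta_opt)
        if it % 20 == 0:
            print(f"iteration {it}: unrolled loss {unrolled:.3f}")
```

In the paper, the amalgamation target is additionally perturbed to reduce the variance of the amalgamation process; in this sketch, the analogous (assumed) change would be adding noise to `target` before computing the distillation term.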

Related research

Cardinal Optimizer (COPT) User Guide (08/30/2022)
Cardinal Optimizer is a high-performance mathematical programming solver...

Training Stronger Baselines for Learning to Optimize (10/18/2020)
Learning to optimize (L2O) has gained increasing attention since classic...

Learning to Generalize Provably in Learning to Optimize (02/22/2023)
Learning to optimize (L2O) has gained increasing popularity, which autom...

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation (02/28/2023)
Learning to Optimize (L2O) has drawn increasing attention as it often re...

An improvement of the convergence proof of the ADAM-Optimizer (04/27/2018)
A common way to train neural networks is the Backpropagation. This algor...

Harris Hawks Optimization: Algorithm and Applications (10/28/2020)
In this paper, a novel population-based, nature-inspired optimization pa...

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design (06/27/2023)
This paper introduces SparseOptimizer, a novel deep learning optimizer t...
