An Anderson-Chebyshev Mixing Method for Nonlinear Optimization

09/07/2018
by Zhize Li, et al.

Anderson mixing (or Anderson acceleration) is an efficient acceleration method for fixed-point iterations (i.e., x_{t+1} = G(x_t)); for example, gradient descent can be viewed as iteratively applying the operator G(x) = x − α∇f(x). Anderson mixing is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. First, we show that Anderson mixing with Chebyshev polynomial parameters can achieve the optimal convergence rate O(√κ log(1/ϵ)), which improves the previous result O(κ log(1/ϵ)) given by [Toth and Kelley, 2015] for quadratic functions. Then, we provide a convergence analysis for minimizing general nonlinear problems. Moreover, if the hyperparameters (e.g., the Lipschitz smoothness parameter L) are not available, we propose a guessing algorithm that estimates them dynamically, and we prove a similar convergence rate. Finally, experimental results demonstrate that the proposed Anderson-Chebyshev mixing method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD) and Nesterov's accelerated GD. These algorithms combined with the proposed guessing algorithm (which estimates the hyperparameters dynamically) also achieve much better performance.
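As a rough illustration of the idea (not the paper's exact Anderson-Chebyshev scheme), the sketch below applies generic limited-memory Anderson mixing to the gradient-descent fixed-point map G(x) = x − α∇f(x) on a strongly convex quadratic. The memory size m, step size α, and test problem are illustrative assumptions.

```python
# Minimal sketch of Anderson mixing for a fixed-point iteration x_{t+1} = G(x_t).
# Generic limited-memory Anderson acceleration; parameters are illustrative.
import numpy as np

def anderson_mixing(G, x0, m=5, max_iter=200, tol=1e-10):
    """Accelerate the fixed-point iteration x_{t+1} = G(x_t) with memory m."""
    x = x0
    gx = G(x)
    f = gx - x                       # residual of the current iterate
    G_hist, F_hist = [gx], [f]       # histories of G(x_t) and residuals
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        if len(F_hist) > 1:
            # Columns are differences of consecutive residuals / G-values.
            dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
            dG = np.column_stack([G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)])
            # Mixing coefficients from the least-squares problem min_gamma ||f - dF @ gamma||.
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma      # Anderson-mixed update
        else:
            x = gx                   # plain fixed-point step on the first pass
        gx = G(x)
        f = gx - x
        G_hist.append(gx)
        F_hist.append(f)
        # Keep only the last m+1 entries (limited memory).
        G_hist, F_hist = G_hist[-(m + 1):], F_hist[-(m + 1):]
    return x

# Usage on a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x,
# whose gradient-descent map has fixed point x* = A^{-1} b.
rng = np.random.default_rng(0)
Q = rng.standard_normal((50, 50))
A = Q.T @ Q + np.eye(50)             # positive definite Hessian
b = rng.standard_normal(50)
alpha = 1.0 / np.linalg.norm(A, 2)   # step size ~ 1/L
G = lambda x: x - alpha * (A @ x - b)
x_star = anderson_mixing(G, np.zeros(50))
print(np.linalg.norm(A @ x_star - b))  # residual should be near zero
```

In this sketch each step combines the last few iterates with coefficients chosen to minimize the mixed residual; the paper's method additionally chooses the mixing parameters from Chebyshev polynomials to obtain the O(√κ log(1/ϵ)) rate.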
