Asymptotic behaviour of learning rates in Armijo's condition

07/07/2020
by   Tuyen Trung Truong, et al.

Fix a constant 0<α<1. For a C^1 function f:ℝ^k→ℝ, a point x and a positive number δ>0, we say that Armijo's condition is satisfied if f(x-δ∇f(x))-f(x) ≤ -αδ||∇f(x)||^2. This condition is the basis for the well-known Backtracking Gradient Descent (Backtracking GD) algorithm. Consider a sequence {x_n} defined by x_{n+1}=x_n-δ_n∇f(x_n), for positive numbers δ_n for which Armijo's condition is satisfied. We show that if {x_n} converges to a non-degenerate critical point, then {δ_n} must be bounded. Moreover, this boundedness can be quantified in terms of the norms of the Hessian ∇^2f and its inverse at the limit point. This complements the first author's results on Unbounded Backtracking GD, and shows that in the case of convergence to a non-degenerate critical point the behaviour of Unbounded Backtracking GD is not too different from that of the usual Backtracking GD. On the other hand, in the case of convergence to a degenerate critical point the behaviours can differ substantially. We run some experiments to illustrate that both scenarios do occur. In another part of the paper, we argue that Backtracking GD has the correct unit (according to a definition by Zeiler in his Adadelta paper). The main point is that since the learning rate in Backtracking GD is bound by Armijo's condition, it is not unitless.
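As an illustration of Armijo's condition and of the boundedness of the learning rates, here is a minimal Python sketch. It is not the authors' code. The quadratic test function f(x) = ½ xᵀAx, the matrix `A`, the starting point, the initial learning rate `delta0` and the halving factor `beta` are all assumptions chosen for illustration. For this particular quadratic, a direct computation shows that Armijo's condition forces δ ≤ 2(1-α)/λ_min(A); this is only the elementary quadratic case and is not claimed to be the bound stated in the paper.

```python
import numpy as np

def armijo_holds(f, grad_f, x, delta, alpha):
    """Check f(x - delta*grad) - f(x) <= -alpha * delta * ||grad||^2."""
    g = grad_f(x)
    return f(x - delta * g) - f(x) <= -alpha * delta * np.dot(g, g)

def backtracking_gd(f, grad_f, x0, alpha, delta0=1.0, beta=0.5, n_steps=200):
    """Gradient descent where each learning rate is shrunk by the
    (assumed) factor beta until Armijo's condition is satisfied."""
    x = np.asarray(x0, dtype=float)
    deltas = []
    for _ in range(n_steps):
        g = grad_f(x)
        if np.dot(g, g) == 0.0:  # exact critical point reached
            break
        delta = delta0
        while not armijo_holds(f, grad_f, x, delta, alpha):
            delta *= beta
        deltas.append(delta)
        x = x - delta * g
    return x, deltas

# Hypothetical test problem: a quadratic with a non-degenerate minimum at 0.
A = np.array([[2.0, 1.0], [1.0, 3.0]])  # symmetric positive definite
alpha = 0.3

def f(x):
    return 0.5 * x @ A @ x

def grad_f(x):
    return A @ x

x_final, deltas = backtracking_gd(f, grad_f, x0=[1.0, -0.7], alpha=alpha)

# For this quadratic, Armijo's condition implies
# delta <= 2 * (1 - alpha) / lambda_min(A).
lambda_min = np.linalg.eigvalsh(A)[0]
delta_bound = 2.0 * (1.0 - alpha) / lambda_min
print("largest accepted learning rate:", max(deltas))
print("quadratic-case upper bound:    ", delta_bound)
```

Since the Hessian here is the constant matrix A, λ_min(A) equals 1/||A^{-1}||, which shows in the simplest setting how the inverse Hessian norm can control the admissible learning rates.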

