Some convergent results for Backtracking Gradient Descent method on Banach spaces

01/16/2020
by Tuyen Trung Truong, et al.

Our main result concerns the following condition.

Condition C. Let X be a Banach space. A C^1 function f: X→R satisfies Condition C if, whenever {x_n} converges weakly to x and lim_{n→∞} ||∇f(x_n)|| = 0, then ∇f(x) = 0.

We assume that a canonical isomorphism between X and its dual X^* is given, as is the case when X is a Hilbert space.

Theorem. Let X be a reflexive, complete Banach space and let f: X→R be a C^2 function that satisfies Condition C. Assume moreover that sup_{x∈S} ||∇^2 f(x)|| < ∞ for every bounded set S⊂X. Choose a random point x_0∈X and construct, by the Local Backtracking GD procedure (which depends on three hyper-parameters α, β, δ_0; see later for details), the sequence x_{n+1} = x_n - δ(x_n)∇f(x_n). Then:

1) Every cluster point of {x_n}, in the weak topology, is a critical point of f.

2) Either lim_{n→∞} f(x_n) = -∞ or lim_{n→∞} ||x_{n+1} - x_n|| = 0.

3) Here we work with the weak topology. Let C be the set of critical points of f, and assume that C has a bounded component A. Let B be the set of cluster points of {x_n}. If B∩A ≠ ∅, then B⊂A and B is connected.

4) Assume that f has at most countably many saddle points. Then, for generic choices of α, β, δ_0 and of the initial point x_0, if the sequence {x_n} converges in the weak topology, the limit point cannot be a saddle point.
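The abstract defers the details of the Local Backtracking GD procedure, so the following is only a minimal sketch of the standard (non-local) backtracking scheme with Armijo's condition in finite dimensions, not the paper's procedure. It assumes the usual roles for the three hyper-parameters: 0 < α < 1 is the sufficient-decrease constant, 0 < β < 1 is the shrink factor, and δ_0 > 0 is the initial step size. The function name `backtracking_gd`, the default values, and the `steps` and `max_shrinks` limits are hypothetical choices made for illustration.

```python
def backtracking_gd(f, grad, x0, alpha=0.5, beta=0.7, delta0=1.0,
                    steps=200, max_shrinks=60):
    """Standard Armijo backtracking gradient descent on R^k (illustrative sketch).

    Assumed roles (not taken from the paper): alpha is the Armijo constant,
    beta the shrink factor, delta0 the initial step size.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        g2 = sum(gi * gi for gi in g)  # squared gradient norm
        if g2 == 0.0:
            break  # x is a critical point
        fx = f(x)
        delta = delta0
        for _ in range(max_shrinks):
            trial = [xi - delta * gi for xi, gi in zip(x, g)]
            # Armijo's condition: f(x - delta*g) - f(x) <= -alpha*delta*||g||^2
            if f(trial) - fx <= -alpha * delta * g2:
                x = trial
                break
            delta *= beta  # step too large: shrink and retry
        else:
            break  # no admissible step within max_shrinks; stop iterating
    return x


# Usage on a simple quadratic with minimiser (1, -3).
def quad(p):
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 3.0) ** 2


def quad_grad(p):
    return [2.0 * (p[0] - 1.0), 4.0 * (p[1] + 3.0)]


x_min = backtracking_gd(quad, quad_grad, [0.0, 0.0])
```

Each accepted step lowers f by at least α·δ·||∇f(x_n)||^2. This sufficient-decrease property is what arguments for statements such as 2) typically rely on.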


