Convergence of Contrastive Divergence with Annealed Learning Rate in Exponential Family

05/20/2016
by Bai Jiang, et al.

In our recent paper, we showed that in an exponential family, contrastive divergence (CD) with a fixed learning rate gives asymptotically consistent estimates (Wu et al., 2016). In this paper, we establish the consistency and convergence rate of CD with an annealed learning rate $\eta_t$. Specifically, suppose CD-$m$ generates the sequence of parameters $\{\theta_t\}_{t > 0}$ using an i.i.d. data sample $X_1^n \sim p_{\theta^*}$ of size $n$; then $\delta_n(X_1^n) = \limsup_{t \to \infty} \left\| \sum_{s=t_0}^{t} \eta_s \theta_s \big/ \sum_{s=t_0}^{t} \eta_s - \theta^* \right\|$ converges in probability to 0 at a rate of $1/\sqrt{n}$. The number $m$ of MCMC transitions in CD affects only the constant factor in the convergence rate. Our proof is not a simple extension of the one in Wu et al. (2016), which depends critically on the fact that $\{\theta_t\}_{t > 0}$ is a homogeneous Markov chain conditional on the observed sample $X_1^n$. Under an annealed learning rate, the homogeneous Markov property is not available, and we have to develop an alternative approach based on super-martingales. Experimental results of CD on a fully-visible $2 \times 2$ Boltzmann machine are provided to demonstrate our theoretical results.
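
To make the quantities in the abstract concrete, here is a minimal NumPy sketch of CD-m with an annealed learning rate and the learning-rate-weighted average of the iterates. It is an illustration under stated assumptions, not the authors' experimental code: the 2 × 2 grid layout with -1/+1 units and one parameter per edge, the systematic-scan Gibbs sweep used as the MCMC transition, the schedule eta_t = eta0 * t^(-power), and all function and variable names are hypothetical choices. The average is also evaluated at a finite number of steps, whereas the abstract's statistic is defined in the limit t → ∞.

```python
import itertools
import numpy as np

# Assumed layout: 4 units on a 2 x 2 grid, coded -1/+1, one parameter per edge.
EDGES = [(0, 1), (2, 3), (0, 2), (1, 3)]

def suff_stats(x):
    """Sufficient statistics phi(x): the product x_i * x_j for every edge."""
    return np.stack([x[:, i] * x[:, j] for i, j in EDGES], axis=1)

def sample_exact(theta, n, rng):
    """Draw n i.i.d. samples from p_theta by enumerating all 16 states."""
    states = np.array(list(itertools.product([-1, 1], repeat=4)))
    logits = suff_stats(states) @ theta
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return states[rng.choice(len(states), size=n, p=probs)]

def gibbs_sweep(x, theta, rng):
    """One systematic-scan Gibbs sweep over the 4 units, for every row of x."""
    x = x.copy()
    for k in range(4):
        # Local field on unit k from its neighbours.
        field = np.zeros(len(x))
        for w, (i, j) in zip(theta, EDGES):
            if i == k:
                field += w * x[:, j]
            elif j == k:
                field += w * x[:, i]
        # P(x_k = +1 | rest) = exp(field) / (exp(field) + exp(-field)).
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
        x[:, k] = np.where(rng.random(len(x)) < p_plus, 1, -1)
    return x

def cd_annealed(data, m, steps, t0, eta0, power, rng):
    """Run CD-m and return sum(eta_s * theta_s) / sum(eta_s) over s >= t0."""
    theta = np.zeros(len(EDGES))
    data_mean = suff_stats(data).mean(axis=0)
    weighted_sum = np.zeros_like(theta)
    weight_total = 0.0
    for t in range(1, steps + 1):
        eta = eta0 * t ** (-power)  # assumed annealing schedule
        if t >= t0:
            weighted_sum += eta * theta
            weight_total += eta
        # m MCMC transitions started from the observed data.
        x = data
        for _ in range(m):
            x = gibbs_sweep(x, theta, rng)
        # CD update: data statistics minus statistics after m transitions.
        theta = theta + eta * (data_mean - suff_stats(x).mean(axis=0))
    return weighted_sum / weight_total

rng = np.random.default_rng(0)
theta_star = np.array([0.5, -0.3, 0.2, 0.4])  # illustrative true parameter
data = sample_exact(theta_star, n=4000, rng=rng)
theta_bar = cd_annealed(data, m=1, steps=1000, t0=200,
                        eta0=0.5, power=0.6, rng=rng)
delta = np.linalg.norm(theta_bar - theta_star)
print("averaged estimate:", theta_bar)
print("distance to theta_star:", delta)
```

Rerunning the sketch with larger `n`, or with a different `m`, gives a rough empirical view of how the distance to `theta_star` depends on the sample size and on the number of MCMC transitions.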

research
03/17/2016

Convergence of Contrastive Divergence Algorithm in Exponential Family

This paper studies the convergence properties of contrastive divergence ...
research
02/22/2021

Super-Convergence with an Unstable Learning Rate

Conventional wisdom dictates that learning rate should be in the stable ...
research
02/24/2022

An optimal scheduled learning rate for a randomized Kaczmarz algorithm

We study how the learning rate affects the performance of a relaxed rand...
research
10/07/2021

Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect

Recent empirical advances show that training deep models with large lear...
research
05/11/2023

On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm

When fitting the learning data of an individual to algorithm-like learni...
research
06/04/2015

Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA

We study the problem of recovering the subspace spanned by the first k p...
research
12/06/2020

Contrastive Divergence Learning is a Time Reversal Adversarial Game

Contrastive divergence (CD) learning is a classical method for fitting u...
