On Markov Chain Gradient Descent

09/12/2018
by Tao Sun, et al.

Stochastic gradient methods are the workhorse algorithms of large-scale optimization in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a variant of stochastic gradient descent in which the random samples are taken along the trajectory of a Markov chain. Existing results on this method assume convex objectives and a reversible Markov chain, which limits their applicability. We establish new non-ergodic convergence under wider step sizes, for nonconvex problems, and for non-reversible finite-state Markov chains. Allowing nonconvexity makes the method applicable to broader problem classes; non-reversible finite-state Markov chains, on the other hand, can mix substantially faster. To obtain these results, we introduce a new technique that varies the mixing levels of the Markov chains. The reported numerical results validate our contributions.

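To make the sampling scheme concrete, here is a minimal sketch of Markov chain gradient descent on a toy finite-sum least-squares problem. It is not the paper's implementation: the transition matrix P, the quadratic component losses, and the 1/k^0.75 step-size rule are illustrative assumptions.

```python
import numpy as np

# Sketch of Markov chain gradient descent (MCGD) on a finite sample set.
# Hypothetical setup: f(x) = (1/M) * sum_i f_i(x) with f_i(x) = 0.5*(a_i^T x - b_i)^2,
# and the sample index at each step is drawn along the trajectory of a Markov chain
# with row-stochastic transition matrix P, rather than i.i.d. as in plain SGD.

rng = np.random.default_rng(0)

M, d = 5, 3                      # number of component functions, dimension
A = rng.normal(size=(M, d))      # toy data defining the component losses
b = rng.normal(size=M)

# One-directional lazy walk on a cycle: stay with prob. 0.5, move forward with prob. 0.5.
# The matrix is doubly stochastic (uniform stationary distribution) and, for M >= 3,
# the chain is non-reversible -- the setting the paper's analysis also covers.
P = np.zeros((M, M))
for i in range(M):
    P[i, i] = 0.5
    P[i, (i + 1) % M] = 0.5

def grad_i(x, i):
    """Gradient of the i-th component f_i(x) = 0.5*(a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
state = 0
for k in range(1, 5001):
    # Next sample index comes from the Markov chain trajectory, not an i.i.d. draw.
    state = rng.choice(M, p=P[state])
    # Diminishing step size (illustrative choice); the paper analyzes wider step-size rules.
    gamma = 1.0 / k ** 0.75
    x -= gamma * grad_i(x, state)

print("approximate minimizer:", x)
```

Because the chain's stationary distribution is uniform over the M samples, long-run trajectory averages of the sampled gradients approximate the full gradient, which is what makes the biased-per-step updates converge.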