Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines

05/28/2021
by Aurélien Decelle, et al.

Training Restricted Boltzmann Machines (RBMs) has long been challenging because the log-likelihood gradient is hard to compute precisely. Over the past decades, many works have proposed more or less successful training recipes, but without studying the crucial quantity of the problem: the mixing time, i.e., the number of Monte Carlo iterations needed to sample new configurations from the model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, equilibrium and out-of-equilibrium, depending on the interplay between the mixing time of the model and the number of steps, k, used to approximate the gradient. We further show empirically that this mixing time increases during learning, which often causes a transition from one regime to the other as soon as k becomes smaller than this time. In particular, we show that with the popular k-step (persistent) contrastive divergence approaches and small k, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. By contrast, RBMs trained in equilibrium display faster dynamics and a smooth convergence toward dataset-like configurations during sampling. Finally, we discuss how to exploit both regimes in practice depending on the task at hand: (i) small values of k can be used to generate convincing samples quickly; (ii) large (or increasingly large) values of k must be used to learn the correct equilibrium distribution of the RBM.
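To make the role of k concrete, here is a minimal sketch of a CD-k gradient estimate for a binary RBM: the positive phase is evaluated on the data, and the negative phase after k steps of block Gibbs sampling. This is a generic illustration under standard assumptions (binary units, sigmoid conditionals), not the authors' code; all function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h(v, W, b):
    """P(h=1|v) for a binary RBM, plus a Bernoulli sample."""
    p = sigmoid(v @ W + b)
    return p, (rng.random(p.shape) < p).astype(float)

def sample_v(h, W, a):
    """P(v=1|h) for a binary RBM, plus a Bernoulli sample."""
    p = sigmoid(h @ W.T + a)
    return p, (rng.random(p.shape) < p).astype(float)

def cd_k_gradient(v_data, W, a, b, k=1):
    """One CD-k estimate of the log-likelihood gradient.

    The chain is started from the data; with k smaller than the
    model's mixing time, the negative samples are out of equilibrium.
    """
    ph_data, h = sample_h(v_data, W, b)
    v = v_data
    for _ in range(k):            # k Monte Carlo (block Gibbs) steps
        _, v = sample_v(h, W, a)
        ph, h = sample_h(v, W, b)
    # positive phase (data statistics) minus negative phase (model statistics)
    dW = v_data.T @ ph_data / len(v_data) - v.T @ ph / len(v)
    da = (v_data - v).mean(axis=0)
    db = (ph_data - ph).mean(axis=0)
    return dW, da, db

# toy usage: 5 visible units, 3 hidden units, batch of 4 random patterns
W = 0.01 * rng.standard_normal((5, 3))
a = np.zeros(5)
b = np.zeros(3)
v = (rng.random((4, 5)) < 0.5).astype(float)
dW, da, db = cd_k_gradient(v, W, a, b, k=5)
```

In the persistent variant (PCD-k), the negative chain would instead be initialized from the previous iteration's negative sample rather than from the data, so the chain is never restarted; the paper's equilibrium vs. out-of-equilibrium distinction hinges on whether k steps suffice for either chain to mix.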
