1 Introduction
Models with unknown normalizing constants arise frequently in many different areas. Examples include Ising models [Isi25] in statistical physics, autologistic models [Bes72] [Bes74] in spatial statistics, exponential random graph models [RPKL07] in sociology, disease transmission models [PHLJH12] in epidemiology, and so on. The corresponding statistical inference problem can be formulated as follows.
Suppose we are given data sampled from a family of probability densities (or probability mass functions) of the form:
(1.1) p(x | θ) = f(x; θ) / Z(θ).
We assume f(x; θ) can be easily evaluated, but the normalizing function Z(θ) is computationally intractable. Examples include:
Example 1 (Ising Model).
Consider a graph G with N nodes; each vertex i is assigned a spin σ_i, which is either +1 or -1. A spin configuration σ is an assignment of spins to all the graph vertices. An Ising model on G is defined by the following Boltzmann distribution over all possible configurations:
(1.2) π(σ) = exp(-β H(σ)) / Z(β),
where H is often referred to as the 'Hamiltonian function' and Z(β) = Σ_σ exp(-β H(σ)) is often referred to as the 'partition function'. As there are 2^N different possible spin configurations, the normalizing constant Z(β) is usually computationally intractable for moderately large N.
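To make the blow-up concrete, here is a minimal brute-force computation of the partition function for a tiny graph. The particular graph, inverse temperature, and ferromagnetic Hamiltonian H(σ) = -Σ_{(i,j)∈E} σ_i σ_j are illustrative assumptions of this sketch, not taken from the paper:

```python
import itertools
import math

def ising_partition_function(edges, n_nodes, beta):
    """Brute-force Z(beta) = sum over all 2^n spin configurations of
    exp(-beta * H(sigma)), with H(sigma) = -sum_{(i,j) in E} s_i * s_j."""
    Z = 0.0
    for spins in itertools.product([-1, 1], repeat=n_nodes):
        neg_hamiltonian = sum(spins[i] * spins[j] for i, j in edges)
        Z += math.exp(beta * neg_hamiltonian)
    return Z

# A 4-cycle: only 2^4 = 16 configurations, but a 40-node graph
# would already require summing on the order of 10^12 terms.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Z = ising_partition_function(edges, 4, 0.5)
```

At β = 0 every configuration has weight 1, so the sum collapses to 2^N, a handy sanity check for the enumeration.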
Example 2 (Exponential Random Graph Model).
Exponential random graph models are a family of probability distributions on graphs. Let G_n be the set of all simple, undirected graphs (without loops or multiple edges) on n vertices. Consider the following distribution on G_n:
(1.3) p(G | θ) = exp(θᵀ S(G)) / Z(θ),
where S(G) is a sufficient statistic defined on G_n; it may be chosen as the degrees of the vertices, the number of edges, the number of triangles, or other subgraph counts, and Z(θ) is the normalizing constant. As there are 2^{n(n-1)/2} possible graphs, Z(θ) is also computationally intractable for moderately large n.
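For intuition, the sum defining the ERGM normalizing constant can be evaluated exactly only for very small n. The sketch below enumerates all graphs on 4 vertices with the edge count as sufficient statistic; this is an illustrative choice of ours, convenient because with this statistic the edges are independent and Z(θ) also has the closed form (1 + e^θ)^{n(n-1)/2}:

```python
import itertools
import math

def ergm_partition_function(n, theta):
    """Brute-force Z(theta) = sum over all simple graphs G on n vertices
    of exp(theta * #edges(G)); there are 2^{n(n-1)/2} graphs to visit."""
    possible_edges = list(itertools.combinations(range(n), 2))
    Z = 0.0
    for included in itertools.product([0, 1], repeat=len(possible_edges)):
        Z += math.exp(theta * sum(included))
    return Z

Z = ergm_partition_function(4, 0.3)  # only 2^6 = 64 graphs here
```

The brute-force sum can be compared against the closed form as a check; for richer statistics such as triangle counts no such closed form exists, which is exactly the source of the intractability.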
It is of natural interest to do inference on the parameter θ. It turns out that the classical route for statistical inference (the maximum likelihood approach) cannot be applied due to the intractability of Z(θ). Current frequentist solutions are mainly based on approximation methods such as pseudolikelihood approximation [Bes74], MCMC-MLE [Gey91], and stochastic approximation [You88]. Frequentist approaches are usually computationally efficient but do not have theoretical guarantees. In fact, it is known that there are cases where these approximation methods perform poorly; see [CMRT09] for discussion.
From a Bayesian perspective, suppose a prior π(θ) is adopted. The posterior can be formally calculated by π(θ | x) ∝ π(θ) f(x; θ) / Z(θ). A central part of Bayesian inference is then to understand the posterior distribution. For example, if one is able to (asymptotically) draw samples from the posterior (usually by Markov chain Monte Carlo algorithms), then the expectation of any function h of interest can be estimated by:
(1.4) E[h(θ) | x] ≈ (1/n) Σ_{i=1}^n h(θ_i),
where θ_1, …, θ_n are samples drawn from the posterior.
However, the unknown normalizing function Z(θ) makes MCMC sampling quite challenging. Consider a standard Metropolis-Hastings (MH) algorithm with proposal density q(θ′ | θ); in each iteration the acceptance ratio is of the form:
(1.5) a(θ, θ′) = min{1, [π(θ′) f(x; θ′) q(θ | θ′) Z(θ)] / [π(θ) f(x; θ) q(θ′ | θ) Z(θ′)]}.
This cannot be directly computed, as the ratio Z(θ) / Z(θ′) is unknown. The posterior distribution is often referred to as a doubly-intractable distribution, as the Metropolis-Hastings algorithm is exact only after infinitely many steps, and each iteration includes an infeasible calculation [MGM06].
One of the most popular methods to resolve this issue is the exchange algorithm [MGM06] proposed by Murray et al. Roughly speaking, the exchange algorithm is an MCMC algorithm which uses an auxiliary variable at each step to estimate the unknown ratio of normalizing constants (see Algorithm 2 for details). The algorithm is easy to implement and is asymptotically exact.
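A minimal sketch of one exchange update may help fix ideas. The function names, the symmetric proposal, and the flat prior are simplifying assumptions of this sketch, not part of the algorithm as stated in the paper:

```python
import math
import random

def exchange_step(theta, y, log_f, sample_likelihood, propose):
    """One exchange-algorithm update (sketch).

    log_f(x, t)          -- log of the unnormalized likelihood f(x; t)
    sample_likelihood(t) -- draws an EXACT sample w from f(.; t)/Z(t)
    propose(t)           -- symmetric proposal; a flat prior is assumed
    """
    theta_new = propose(theta)
    w = sample_likelihood(theta_new)           # auxiliary variable
    # The intractable Z(theta)/Z(theta') cancels against the auxiliary
    # draw, leaving only terms that can be evaluated directly.
    log_alpha = (log_f(y, theta_new) - log_f(y, theta)
                 + log_f(w, theta) - log_f(w, theta_new))
    if math.log(random.random()) < min(0.0, log_alpha):
        return theta_new
    return theta
```

With a non-flat prior and an asymmetric proposal, the prior and proposal ratios would multiply into log_alpha exactly as in the usual MH ratio; the key point is that Z(θ)/Z(θ′) never needs to be evaluated.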
The exchange algorithm is widely used for sampling from doubly-intractable distributions. However, there are very limited studies of its theoretical properties. One fundamental problem with any MCMC algorithm is its convergence rate. On the one hand, an a priori bound on how long the chain should run to converge within any given accuracy would help guide practical use. On the other hand, existing theory shows there are deep connections between the convergence rate and the Markov chain Central Limit Theorem. A chain with a sub-geometric convergence rate may fail to admit a Central Limit Theorem, and estimators derived from Markov chain samples may even have infinite variance.
This motivates us to study the theoretical properties of the exchange algorithm. Our main contributions include:

We prove several comparison-type results between the exchange algorithm and the original Metropolis-Hastings algorithm, comparing the two in terms of asymptotic variance and convergence rate. We establish both necessary and sufficient conditions for the exchange chain to have 'good' convergence properties in the sense of geometric ergodicity.

We apply our theoretical results to a variety of practical examples, such as location models, Ising models, and exponential random graph models, which cover many of the practical applications of the exchange algorithm. Our results justify the theoretical usefulness of the exchange algorithm in practical situations. To the best of our knowledge, this is the first result establishing geometric ergodicity for the exchange algorithm on a noncompact parameter space.

We prove a Central Limit Theorem for the exchange algorithm given that it is geometrically ergodic. We also provide lower and upper bounds for the asymptotic variance of the exchange algorithm.

We compare the exchange algorithm with pseudo-marginal Monte Carlo, another popular method for tackling doubly-intractable distributions. We design a hybrid algorithm, called Multi-armed Bandit Monte Carlo, which interpolates between the two algorithms and achieves higher acceptance probability.
The remainder of the paper is organized as follows. In Section 2 we set up preliminary definitions and review related results. Our main findings are stated and proved in Section 3: the results are heuristically summarized in Section 3.2, with proofs provided in Section 3.3. Section 3.4 shows that the theorems proved in Section 3.3 can be applied to many practical models, including location models, Poisson models, and a large subset of exponential family models which contains ERGMs and Ising models. Section 4 studies a Beta-Binomial example carefully; quantitative bounds for both the Metropolis-Hastings chain and the exchange algorithm are provided. Section 5 concludes the paper and discusses possible further directions.
2 Preliminaries
2.1 The Exchange Algorithm
Let Θ be the parameter space and let π be the target density on Θ. The standard Metropolis-Hastings algorithm (MHMC) is described in Algorithm 1.
Input: initial setting , number of iterations , Markov transition kernel
However, in our setting the posterior density has the expression π(θ | x) ∝ π(θ) f(x; θ) / Z(θ), where Z(θ) is an unknown function of θ. Therefore, at each step the acceptance ratio
(2.1) a(θ, θ′) = min{1, [π(θ′) f(x; θ′) q(θ | θ′) Z(θ)] / [π(θ) f(x; θ) q(θ′ | θ) Z(θ′)]}
contains the intractable term Z(θ) / Z(θ′).
The exchange algorithm, described below in Algorithm 2, is a clever extension of MHMC which uses an auxiliary variable at each step to estimate the unknown ratio of normalizing constants. Throughout this paper, we will refer to the Markov chain of the MHMC algorithm as the MH chain, and to the Markov chain of the exchange algorithm as the exchange chain.
Input: initial setting , number of iterations
Comparing the exchange algorithm with the Metropolis-Hastings algorithm (Algorithm 1), it turns out the only difference is that the uncomputable ratio Z(θ) / Z(θ′) appearing in Algorithm 1 is replaced by f(w; θ) / f(w; θ′) in Algorithm 2, where w is the auxiliary variable generated in each step. Roughly speaking, the exchange algorithm uses the importance-sampling-type estimator f(w; θ) / f(w; θ′), with w drawn from the likelihood at θ′, to estimate Z(θ) / Z(θ′), and plugs it into the acceptance ratio of the Metropolis-Hastings algorithm. The exchange algorithm is easy to implement, and it is simple in the sense that it differs from the original Metropolis-Hastings algorithm only by an extra auxiliary variable in one step. Meanwhile, the estimator is cleverly designed so that the correct stationary distribution is still preserved. The strong assumption is the ability to draw exact samples from the likelihood at Step 3, which is highly nontrivial and often impractical.
In the original paper, the use of the exchange algorithm is illustrated by a stylized example, as well as an example with the Ising model. Practitioners also use the exchange algorithm in exponential random graph models (ERGM) [CF11], spatial autoregressive (SAR) models [HL16], Bayesian hypothesis testing [DW18], and so on. However, theoretical studies of doubly-intractable distributions and the exchange algorithm are still very limited. Murray et al. proved the detailed-balance equation holds for the exchange algorithm in their original paper [MGM06]. Nicholls et al. [NFW12] gave a sufficient condition for a minorization condition of the exchange chain. Habeck et al. [HRS20] provided stability properties of doubly-intractable distributions. Medina-Aguayo et al. [MARS20] provided guarantees for the Monte Carlo within Metropolis algorithm for approximate sampling of doubly-intractable distributions. However, it seems the only existing result concerning the convergence rate of the exchange algorithm is in [NFW12], and the conditions proposed there seem to be strong and are generally not satisfied on unbounded parameter spaces, which are of practitioners' main interest. For example, geometric ergodicity is the usual notion of a chain having a 'good' convergence rate, but there is no result showing whether the exchange algorithm is geometrically ergodic or not. This motivates us to study the theoretical properties of the exchange algorithm.
It is worth mentioning that there are numerous other sampling algorithms besides the exchange algorithm for tackling doubly-intractable distributions. These algorithms are broadly divided into two classes: 'exact algorithms' and 'noisy algorithms'. As the names suggest, 'exact algorithms' have the correct stationary distribution, while 'noisy algorithms' only follow the target distribution approximately, even after infinitely many steps. Numerous algorithms, such as pseudo-marginal MCMC [MPRB06], the exchange algorithm [MGM06], noisy MCMC [AFEB16], and the double Metropolis-Hastings sampler [Lia10], have been proposed in each category. Usually, exact algorithms have the virtue of sampling from the correct target distribution in the asymptotic sense, but they require strict assumptions such as perfect sampling. The comparison between these algorithms is beyond the scope of this paper, and we refer the reader to the excellent survey [PH18].
In particular, pseudo-marginal Monte Carlo (PMC) is another popular approach to tackling doubly-intractable distributions and shares many similarities with the exchange algorithm. PMC was first proposed in the seminal paper [Bea03] by Beaumont in 2003. In 2006, Møller first used the PMC approach to sample from doubly-intractable posterior distributions. PMC can be viewed as a generalization of the popular data-augmentation schemes, which sample jointly from the parameter space and the sample space. PMC is similar to the exchange algorithm in that they are both 'exact algorithms' and both use 'importance-sampling'-type estimators to estimate the unknown ratio of normalizing constants. A comparison between the estimators used in PMC and the exchange algorithm is discussed in [Wan19]. Meanwhile, much important progress has been made on the theoretical properties of PMC; see [AR09] and [AV15] for discussions. But theoretical results for the exchange algorithm are still very limited. This is another motivation of this paper.
2.2 Markov Chain Convergence
Let (X_n) be a Markov chain on a state space Θ with transition kernel P. It is further assumed that P(x, ·) is absolutely continuous, with a density with respect to the Lebesgue measure, except perhaps for an atom at x.
If π is a probability measure such that π(A) = ∫ P(x, A) π(dx) for all measurable A, then π is called the stationary distribution of the chain. Furthermore, the chain is called reversible if it satisfies a detailed-balance equation, that is:
π(dx) P(x, dy) = π(dy) P(y, dx).
By design, Metropolis-Hastings algorithms are reversible Markov chains with the target distribution as stationary distribution.
Let Pⁿ(x, ·) be the probability measure after running the chain n steps from starting point x. The distance between Pⁿ(x, ·) and π is measured by the total variation distance, defined as follows:
Definition 1.
Let μ, ν be two probability measures on a sigma-algebra F of subsets of a sample space Ω. The total variation distance between μ and ν is defined as:
‖μ − ν‖_TV = sup_{A ∈ F} |μ(A) − ν(A)|.
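For discrete measures the supremum is attained and equals half the L1 distance between the probability vectors, which gives a one-line computation (a small illustrative sketch):

```python
def tv_distance(mu, nu):
    """Total variation between two discrete distributions given as dicts:
    TV(mu, nu) = (1/2) * sum_x |mu(x) - nu(x)|."""
    support = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in support)

d = tv_distance({'a': 0.5, 'b': 0.5}, {'a': 0.8, 'b': 0.2})
```

The distance is 0 exactly when the measures agree and 1 when their supports are disjoint.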
We now note two simple properties which turn out to be necessary for analyzing Markov chain convergence and are satisfied by most chains of interest:
Definition 2 (irreducible).
A Markov chain is irreducible if there exists a nonzero finite measure μ on Θ such that for every measurable A with μ(A) > 0 and every x ∈ Θ, there exists a positive integer n such that Pⁿ(x, A) > 0.
Roughly speaking, irreducibility ensures that every subset with positive measure can be reached from anywhere in the state space.
Definition 3 (Aperiodic).
A Markov chain is aperiodic if there do not exist d ≥ 2 and disjoint subsets Θ_1, …, Θ_d ⊂ Θ with μ(Θ_1) > 0 such that P(x, Θ_{i+1}) = 1 for all x ∈ Θ_i (indices taken mod d).
Aperiodicity ensures the Markov chain does not explore the state space in a periodic way.
The following theorem establishes the convergence for general state space Markov chain.
Theorem 1 ([Mt12], Chapter 13).
If a Markov chain on a state space with countably generated σ-algebra is irreducible and aperiodic, and has a stationary distribution π, then
‖Pⁿ(x, ·) − π‖_TV → 0
as n → ∞ for π-a.e. x.
Theorem 1 is widely applied to MCMC algorithms, as it is often easy to check that the MCMC chain is irreducible (with respect to π or the Lebesgue measure) and aperiodic.
On the other hand, Theorem 1 does not give any quantitative result on the convergence rate. Specifically, a Markov chain is said to be uniformly ergodic if
(2.2) sup_x ‖Pⁿ(x, ·) − π‖_TV ≤ C ρⁿ
for some C < ∞ and ρ < 1.
A Markov chain is said to be geometrically ergodic if
(2.3) ‖Pⁿ(x, ·) − π‖_TV ≤ C(x) ρⁿ
for some ρ < 1 and a function C(x) that is finite for π-a.e. x.
Geometric ergodicity is extremely important for analyzing Markov chains because:

If one can give sharp bounds on the right-hand side of (2.3), practical users can calculate, before running the algorithm, how many iterations are sufficient to bring the total variation distance within any prescribed accuracy.

As discussed in Theorems 1-6 of [RR08], geometric ergodicity is closely related to the existence of a Markov chain Central Limit Theorem (CLT). Roughly speaking, for a reversible Markov chain (which is the case for Metropolis-Hastings chains), geometric ergodicity is essentially a necessary and sufficient condition for a Markov chain CLT to hold for all square-integrable functions. Therefore, a geometrically ergodic Markov chain is often considered a 'good' chain.
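In practice, the asymptotic variance appearing in such a Markov chain CLT is often estimated from chain output. One standard sketch is the batch-means estimator; the function name and its defaults below are illustrative, not from the paper:

```python
def batch_means_variance(samples, n_batches=20):
    """Batch-means estimate of sigma^2 in the Markov chain CLT
    sqrt(n) * (sample mean - mu) -> N(0, sigma^2): split the chain into
    batches and rescale the variance of the batch means."""
    n = len(samples)
    b = n // n_batches                       # batch length
    means = [sum(samples[i * b:(i + 1) * b]) / b for i in range(n_batches)]
    overall = sum(means) / n_batches
    var_of_means = sum((m - overall) ** 2 for m in means) / (n_batches - 1)
    return b * var_of_means                  # estimates sigma^2
```

For i.i.d. draws, σ² reduces to the ordinary variance of a single draw; for a correlated chain it inflates by the integrated autocorrelation, which is exactly what makes slowly mixing chains statistically inefficient.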
There are numerous studies concerning the convergence rate of Markov chains. For finite state space Markov chains, many techniques, including geometric inequalities, Fourier analysis, and multicommodity flows, are used to bound the right-hand side of (2.2); see [Sin92], [DS91], [DS81] for examples.
For general state space Markov chains, geometric ergodicity is usually established by minorization and drift conditions; see [Ros95], [MT96], [RT96b], [MT12]. It is also worth mentioning that a spectral characterization of geometric ergodicity for reversible Markov chains is established in [RR97].
Geometric ergodicity of the Metropolis-Hastings chain can often be established via [MT96], [RR97]. However, as discussed above, in the doubly-intractable setting implementing the Metropolis-Hastings chain is impractical, so we are more interested in studying the convergence of the exchange algorithm.
To establish conditions that ensure uniform or geometric ergodicity, we need the concept of a small set.
Definition 4.
A set C is called small if there exist an integer n₀ ≥ 1, ε > 0, and a probability measure ν such that, for any x ∈ C,
(2.4) P^{n₀}(x, ·) ≥ ε ν(·).
Condition (2.4) is also called a minorization condition.
The following theorem about uniform ergodicity is taken from [MT12], Chapter 16.
Theorem 2 (Uniform Ergodicity).
For a Markov chain, the following are equivalent:

A minorization condition holds for the whole space Θ, i.e., there exist an integer n₀ ≥ 1, ε > 0, and a probability measure ν such that, for any x ∈ Θ, P^{n₀}(x, ·) ≥ ε ν(·).

The chain is uniformly ergodic, and ‖Pⁿ(x, ·) − π‖_TV ≤ (1 − ε)^{⌊n/n₀⌋} for all x.
When the state space is finite, all aperiodic and irreducible Markov chains are uniformly ergodic. However, when the state space is infinite, most chains are not uniformly ergodic. The following theorem, which characterizes geometric ergodicity, is taken from Meyn and Tweedie [MT12], Chapters 15 and 16, and Roberts and Rosenthal [RR97], Proposition 1 and Theorem 2.
Theorem 3 (Geometric Ergodicity).
For a reversible Markov chain, the following are equivalent:

The chain is geometrically ergodic.

There exist a function V ≥ 1, finite for at least one point, a small set C, and constants λ < 1, b < ∞ such that:
(2.5) PV(x) ≤ λ V(x) + b 1_C(x) for all x.

The spectral radius of P, viewed as an operator on L²₀(π) = {f ∈ L²(π) : ∫ f dπ = 0}, is strictly less than 1. Here π is the stationary distribution.
Furthermore, it is shown by Mengersen and Tweedie [MT96] that a symmetric random-walk Metropolis algorithm is geometrically ergodic essentially if and only if the target distribution has finite exponential moments. Therefore, if the posterior distribution has exponentially decreasing tails, the symmetric random-walk Metropolis algorithm is geometrically ergodic. Since the conditions in Mengersen and Tweedie [MT96] are often easy to check and are satisfied by a large family of practical Metropolis-Hastings algorithms, [MT96] is often used to conclude that an MCMC algorithm is geometrically ergodic.
3 Theoretical Results
3.1 Asymptotic variance and Peskun’s ordering
We start by proving the following simple but useful lemma, indicating that the exchange chain is always less statistically efficient than the original MH chain:
This lemma shows that the exchange algorithm is less likely to make a move than the original MH algorithm.
Proof.
By Jensen’s inequality (the function is concave).
Meanwhile,
Therefore
as desired. ∎
As the two chains have the same stationary distribution, this shows the exchange chain is dominated by the MH chain in Peskun's ordering [Pes73]. The following then follows directly from [Pes73] and [T98]:
Theorem 4.
Let L²₀(π) be the set of all square-integrable functions under π with mean 0. Define
v(h, P) = lim_{n→∞} Var(h(X₁) + ⋯ + h(X_n)) / n,
where (X_n) is a Markov chain with initial distribution π and transition kernel P. Then the asymptotic variance of the MH chain is no larger than that of the exchange chain,
for all h ∈ L²₀(π).
This quantity is also called the 'asymptotic variance'. Theorem 4 shows the original MH chain has smaller asymptotic variance and is thus statistically more efficient than the exchange chain.
Remark 1.
Theorem 4 is not very surprising: in each iteration of the exchange algorithm, the ratio f(w; θ) / f(w; θ′) can be viewed as an estimator of the unknown quantity Z(θ) / Z(θ′). The standard MH chain, on the other hand, uses the true value directly, which can be viewed as an estimator with variance 0. Therefore it is not surprising that the original MH chain has the smaller asymptotic variance.
However, asymptotic variance is only one way of evaluating a Markov chain. Another natural criterion is the speed of convergence to the stationary distribution. Even though the exchange chain is dominated by the MH chain in Peskun's ordering, the following simple example shows it is possible for the exchange chain to converge to the stationary distribution uniformly faster than the MH chain.
Example 3 (Two point example).
Let , where the parameter space only contains two points:
Suppose the observed data is a single point. Suppose the prior measure on the parameter space is defined by
It is not hard to compute the posterior measure:
which is a uniform measure on . We further assume the transition matrix equals
It is clear that all moves of the MH chain will be accepted; hence the MH chain has transition matrix:
On the other hand, the transition probabilities of the exchange chain can be computed by:
Similarly
so the transition matrix would be
Therefore, with any initialization, the exchange chain converges to the stationary distribution after one step. However, the MH chain never converges, as it jumps back and forth between the two points.
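The contrast in Example 3 can be checked numerically. Assuming the transition matrices work out as described (the MH chain always swaps the two points, while the exchange chain holds or moves with probability 1/2 each), a few matrix-vector products make it explicit:

```python
def step(P, dist):
    """One step of a 2-state chain: distribute mass via new[j] = sum_i d[i] P[i][j]."""
    return [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]

P_mh = [[0.0, 1.0], [1.0, 0.0]]   # MH chain: every proposed swap accepted
P_ex = [[0.5, 0.5], [0.5, 0.5]]   # exchange chain: accepts with probability 1/2

d_mh, d_ex = [1.0, 0.0], [1.0, 0.0]   # both started at the first point
for _ in range(5):
    d_mh, d_ex = step(P_mh, d_mh), step(P_ex, d_ex)
# the exchange chain reaches the uniform stationary distribution after one
# step and stays there; the MH chain oscillates between the two point masses
```

The same computation with a slightly tilted MH matrix (as in Remark 2) shows the phenomenon is not an artifact of exact periodicity.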
Remark 2.
In Example 3 above, the MH chain never converges because it is a periodic chain, i.e., its smallest eigenvalue equals −1. But even if we assume the MH chain is aperiodic, it is still possible that it converges more slowly than the exchange chain: we could tilt the transition matrix above a little bit, and the exchange chain would still converge faster.
Remark 3.
Combining Theorem 4 and Example 3, we can conclude that

In terms of asymptotic variance, the MH chain is at least as good as the exchange chain;

In terms of distributional convergence, it is not possible to derive a general ordering between the MH and exchange chains.
The above results tell us the exchange chain might converge faster or more slowly than the original MH chain. In the rest of this section, we further investigate the convergence speed of the exchange chain.
3.2 Main Results
In this part, we study the convergence properties of the exchange algorithm. As the exchange algorithm can be viewed as a variant of the standard Metropolis-Hastings algorithm, it is natural to ask 'comparison-type' questions. In usual settings, people implement the exchange algorithm because the Metropolis-Hastings algorithm cannot be applied in the doubly-intractable case, and intuition suggests that the MH algorithm converges faster. Therefore, if one knows the convergence speed of one algorithm, one may ask whether the other algorithm has a similar convergence speed. For example, one can ask questions like:

Question 1: Suppose the exchange chain is geometrically ergodic; is the original MH chain also geometrically ergodic?
Or the reverse

Question 2: Suppose the MH chain is geometrically ergodic; is the exchange chain geometrically ergodic as well? If not, can we find sufficient conditions to ensure the exchange chain 'inherits' the convergence rate of the MH chain?
We will answer both questions in the rest of this section. The second question is practically more interesting: in real settings we can usually study the convergence rate of the MH chain even though that chain is not practically implementable, so theoretical guarantees for the exchange chain would justify the usefulness of the exchange algorithm.
Before stating everything rigorously, we summarize our results heuristically here.

(Question 1) If the exchange chain is geometrically ergodic, there is no guarantee that the MH chain is also geometrically ergodic. In fact, Example 3 gives such a counterexample.

(Question 1) If the exchange chain is geometrically ergodic, then any 'lazy' version (defined later) of the MH chain must also be geometrically ergodic. This indicates the exchange chain essentially will not converge faster than the original chain.

(Question 2) If the MH chain is geometrically ergodic, we have an example showing the exchange algorithm is not necessarily geometrically ergodic.

(Question 2) If the MH chain is geometrically ergodic, we have established sufficient conditions to ensure the geometric ergodicity of the exchange algorithm.
3.3 Exchange chain convergence
Now we are ready to study the convergence properties of the exchange chain. Example 3 gives a negative example, indicating the exchange chain may converge much faster than the original chain; the next theorem shows that after making the original chain 'lazy', the original chain is no worse than the exchange chain, which answers Question 1 completely.
Theorem 5.
Suppose the exchange chain is uniformly/geometrically ergodic. Then for any λ ∈ (0, 1), the lazy MH chain is also uniformly/geometrically ergodic, where the lazy version of the MH chain is defined by P_λ(x, ·) = λ δ_x(·) + (1 − λ) P(x, ·).
Proof.
First suppose the exchange chain is uniformly ergodic. Then Theorem 2 shows there exist an integer n₀, ε > 0, and a probability measure ν such that, for any x,
On the other hand, for any measurable set and any point , if , then we have
where . The same result holds if . Therefore,
for all . Thus is uniformly ergodic by Theorem 2.
Let and . In particular let and . Then we have
Meanwhile, as the exchange chain is dominated by the original MH chain in Peskun’s ordering, it is shown by Tierney [T98] Lemma 3 that the supremum of ’s spectrum should be no less than the supremum of ’s spectrum. That is, .
Therefore
Since
, there exists such that
. The lazy version of the MH algorithm is thus geometrically ergodic, as desired. ∎
The next result follows immediately from Theorem 5.
Corollary 1.
Suppose the MH chain has rejection probability uniformly bounded away from 0. If the exchange chain is uniformly/geometrically ergodic, then the original MH chain is also uniformly/geometrically ergodic.
Theorem 5 and Corollary 1 essentially show that the exchange chain will not converge faster than the (slightly modified) original chain. On the other hand, this result does not provide much insight into the convergence speed of the exchange algorithm itself. Therefore, it is of practical interest to establish geometric ergodicity of the exchange algorithm, which gives the user 'peace of mind' about the convergence speed. As shown in Theorem 5 and Corollary 1, geometric ergodicity of the MH chain is essentially a necessary condition for geometric ergodicity of the exchange chain. However, the next example shows this condition is not sufficient.
Example 4 (Exponential likelihood with Gamma prior).
Suppose the likelihood is exponential and the prior distribution is a Gamma distribution. Under this setting, it is easy to compute the posterior distribution, which is again a Gamma distribution. Consider an independence Metropolis-Hastings sampler whose proposal is this posterior, that is,
As the proposal is precisely the posterior, this independence Metropolis-Hastings chain converges perfectly after one step, and is therefore obviously geometrically ergodic. On the other hand, the Markov kernel of the exchange chain is:
where
is the only solution for equation:
for any fixed .
We can also compute the ‘rejection probability’ at point :
Notice that for each fixed , when goes to infinity, we have
and
Therefore the first term of the integral goes to 0 as θ → ∞ by Lebesgue's dominated convergence theorem (with an integrable dominating function), and the second term also goes to 0 as θ → ∞.
Therefore, we have:
It is proved by Roberts and Tweedie [RT96a] (Theorem 5.1) that if an MH chain is geometrically ergodic, then its rejection probability is necessarily bounded away from unity. Therefore, the exchange algorithm in this example is not geometrically ergodic.
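The decay of the acceptance probability in Example 4 can be illustrated numerically. Under one concrete version of the setup, with an Exp(θ) likelihood written with unnormalized density e^{-θx}, a Gamma(α, β) prior, and the posterior Gamma(α + 1, β + x) as independence proposal, the exchange acceptance ratio simplifies to min{1, (θ/θ′) e^{(θ′−θ)w}} with w ~ Exp(θ′). The parameter values below are our own illustrative choices, not from the paper:

```python
import math
import random

def exchange_accept_prob(theta, x, alpha, beta, trials=20000, seed=0):
    """Monte Carlo estimate of the exchange acceptance probability at state
    theta: propose theta' ~ Gamma(alpha + 1, rate beta + x) (the posterior),
    draw the auxiliary w ~ Exp(theta'), and average the acceptance ratio
    min(1, (theta / theta') * exp((theta' - theta) * w))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        tp = rng.gammavariate(alpha + 1.0, 1.0 / (beta + x))  # scale = 1/rate
        w = rng.expovariate(tp)
        total += min(1.0, (theta / tp) * math.exp((tp - theta) * w))
    return total / trials

a_small = exchange_accept_prob(1.0, 1.0, 2.0, 2.0)
a_large = exchange_accept_prob(20.0, 1.0, 2.0, 2.0)
# the acceptance probability shrinks as theta moves into the tail,
# consistent with the rejection probability approaching unity
```

This matches the analysis above: for states far in the tail the auxiliary draw almost never compensates for the likelihood mismatch, so the chain rejects with probability approaching one.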
3.3.1 Geometric Ergodicity of the Exchange Algorithm: Compact Parameter Space
Example 4 answers half of Question 2: if the MH chain is geometrically ergodic, the exchange algorithm is not necessarily geometrically ergodic. We now focus on establishing sufficient conditions under which the exchange chain 'inherits' the convergence rate of the MH chain. We start with the following lemma.
Lemma 2.
Let P₁, P₂ be two reversible Markov transition kernels on the same state space equipped with a σ-algebra. If there exists δ > 0 such that P₁(x, A) ≥ δ P₂(x, A) for any x and any measurable set A, and P₂ is geometrically ergodic, then P₁ is geometrically ergodic.
Proof.
We can write P₁ as P₁ = δ P₂ + (1 − δ) R,
where R is a valid Markov kernel.
Let and . Then we have
similarly
This proves is geometrically ergodic, as desired. ∎
Lemma 2 can be used to establish sufficient conditions for 'convergence rate inheritance' by the exchange chain. Let A be the set on which the likelihood ratio has a positive lower bound. If the probability of A under the auxiliary draw is bounded away from 0 for every pair of parameters, then the exchange chain can 'inherit' the convergence rate of the MH chain. Nicholls et al. proved a similar result in [NFW12], but only covered the uniformly ergodic case.
Theorem 6.
Suppose there exists such that
for any pair of parameters. Suppose the MH chain is uniformly/geometrically ergodic. Then the exchange chain is also uniformly/geometrically ergodic, respectively.
Proof.
For any , the exchange chain has transition density
Lemma 1 proves for all , therefore we have for any .
If the MH chain is uniformly ergodic, Theorem 2 proves there exist ε > 0 and a probability measure ν such that the minorization condition holds for any x. On the other hand, we have:
so the exchange chain also satisfies the minorization condition for the whole space, and is thus uniformly ergodic.
If the MH chain is geometrically ergodic, then Lemma 2 implies the exchange chain is geometrically ergodic. ∎
Theorem 6 can be applied to almost all cases where the parameter space is bounded or compact. We provide one example here:
Example 5 (BetaBinomial model).
Consider the following Beta-Binomial example. Let the likelihood be a binomial distribution with unknown success probability, and let the prior on the success probability be a Beta distribution whose parameters are prefixed positive numbers. Given the data, we would like to sample from the posterior distribution, which is again a Beta distribution.
Consider an independence Metropolis-Hastings sampler with a proposal distribution; the chain then has transition kernel:
Let be the maximizer for , for , we have:
and the chain is therefore uniformly ergodic by Theorem 2. Furthermore, given any pair of parameters, the relevant set has positive probability, and since the parameter only takes values in a compact set, there exists a constant such that the bound holds
uniformly. Hence the exchange chain is uniformly ergodic by Theorem 6.
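Since exact binomial sampling is trivial, Example 5 is easy to implement end to end. The sketch below uses the prior as independence proposal, so the prior and proposal ratios cancel in the acceptance ratio; the parameter values and function names are illustrative choices of ours:

```python
import math
import random

def exchange_beta_binomial(x, n, a, b, iters=20000, seed=0):
    """Exchange algorithm for a Binomial(n, p) likelihood with a Beta(a, b)
    prior, proposing p' from the prior. The binomial normalizer is actually
    known here, so this is purely an illustration of the mechanics."""
    rng = random.Random(seed)

    def log_f(k, p):  # unnormalized log-likelihood p^k (1 - p)^(n - k)
        return k * math.log(p) + (n - k) * math.log(1.0 - p)

    p = 0.5
    samples = []
    for _ in range(iters):
        p_new = rng.betavariate(a, b)                      # proposal = prior
        w = sum(rng.random() < p_new for _ in range(n))    # exact Binomial draw
        # prior and proposal densities cancel, leaving only likelihood terms
        log_alpha = (log_f(x, p_new) - log_f(x, p)
                     + log_f(w, p) - log_f(w, p_new))
        if math.log(rng.random()) < min(0.0, log_alpha):
            p = p_new
        samples.append(p)
    return samples

chain = exchange_beta_binomial(7, 10, 2.0, 2.0)
```

Because the posterior is known in closed form (a Beta distribution), the chain's output can be checked directly against the exact posterior mean, which makes this a convenient test case for the quantitative bounds discussed later.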
Remark 4.
Example 5 is only for illustration, and the bound is obviously crude. We will return to this example and calculate exact rates of convergence in later sections.
However, the uniform probability condition of Theorem 6 is usually too strong for general state space Markov chains. Note that the condition directly implies a uniform lower bound over the whole parameter space. However, when the parameter space is unbounded (say ℝ), most practical models will have the two likelihoods far away from each other, so Theorem 6 cannot be directly applied in these cases. The next part gives more practical sufficient conditions which can be applied on unbounded parameter spaces.
3.3.2 Geometric Ergodicity of the Exchange Algorithm: Noncompact Parameter Space
To establish sufficient conditions on a general parameter space, we first need the following assumptions on the MH chain; the conditions are similar to those of [MT96], [RR97] and are reasonable in practical settings. To fix ideas, we will only state the conditions for Markov chains with target distributions on ℝ, though all our results carry through to higher dimensions after appropriate modifications; see [RR97] for discussion.
We will say that a Metropolis-Hastings chain satisfies the assumption if it:

has a random-walk transition kernel, that is, the proposal is of the form q(θ′ − θ) where q is a continuous density,

has a target density which is log-concave in the tails, that is, there exist constants α > 0 and x₁ such that, for all y ≥ x ≥ x₁:
log π(x) − log π(y) ≥ α (y − x),
and for all y ≤ x ≤ −x₁:
log π(x) − log π(y) ≥ α (x − y).
The first assumption requires a random-walk proposal kernel, which is very common in practice; for example, people often use a uniform distribution as the proposal kernel, which satisfies the assumption. The second assumption requires the tail of the target density to be uniformly exponential or lighter. This covers many posterior distributions, such as Gaussian, Gamma, and exponential posteriors arising from typically-used statistical models. As we will see later, the second assumption is also satisfied by Ising models and exponential random graph models with Gaussian priors. Indeed, it is proved in [MT96] that this condition is essentially necessary for a random-walk Metropolis-Hastings algorithm to be geometrically ergodic.
The following theorem gives a sufficient condition for the geometric ergodicity of the exchange chain.
Theorem 7.
Let be a MetropolisHastings chain on with posterior distribution as stationary distribution. Suppose satisfies Assumption . Furthermore, assume there exists a nonconstant continuous function with such that