 # Discussion of "Unbiased Markov chain Monte Carlo with couplings" by Pierre E. Jacob, John O'Leary and Yves F. Atchadé

This is a contribution to the discussion of "Unbiased Markov chain Monte Carlo with couplings" by Pierre E. Jacob, John O'Leary and Yves F. Atchadé, to appear in the Journal of the Royal Statistical Society, Series B.


## Appendix A Derivation of the Upper Bound

The aim in what follows is to reproduce the proof of Proposition 1 in the paper whilst explicitly tracking the terms that are $h$-dependent. To avoid reproducing large amounts of that paper, we assume familiarity with the notation and quantities defined in that work.

The first part of the argument in the paper uses Assumption 1 to deduce that $\mathbb{E}[\Delta_t^2] \le \tilde{C}\tilde{\delta}^t$ for some $\tilde{C} < \infty$, $\tilde{\delta} \in (0,1)$ and all $t \ge 0$. Our first task is to explicitly compute the constant $\tilde{C}$ in terms of the quantities $D$ and $\eta$ appearing in Assumption 1. To this end, we reproduce the argument alluded to in the paper:

$$
\begin{aligned}
\left(\mathbb{E}\left[|\Delta_t|^{2+\eta}\right]\right)^{\frac{1}{2+\eta}} &= \left(\mathbb{E}\left[|h(X_t)-h(Y_{t-1})|^{2+\eta}\right]\right)^{\frac{1}{2+\eta}} \\
&\le \left(\mathbb{E}\left[|h(X_t)|^{2+\eta}\right]\right)^{\frac{1}{2+\eta}} + \left(\mathbb{E}\left[|h(Y_{t-1})|^{2+\eta}\right]\right)^{\frac{1}{2+\eta}} && \text{(Minkowski's inequality)} \\
&\le D^{\frac{1}{2+\eta}} + D^{\frac{1}{2+\eta}} && \text{(Assumption 1)} \\
\implies \mathbb{E}[\Delta_t^2] = \mathbb{E}\left[\Delta_t^2 \, \mathbb{1}(\tau > t)\right] &\le \mathbb{E}\left[|\Delta_t|^{2+\eta}\right]^{\frac{2}{2+\eta}} \, \mathbb{E}\left[\mathbb{1}(\tau > t)\right]^{\frac{\eta}{2+\eta}} && \text{(Hölder's inequality)} \\
&\le \left(2 D^{\frac{1}{2+\eta}}\right)^2 \left(C\delta^t\right)^{\frac{\eta}{2+\eta}} && \text{(Assumption 2)} \\
&= 4 C^{\frac{\eta}{2+\eta}} D^{\frac{2}{2+\eta}} \tilde{\delta}^t = \tilde{C}\tilde{\delta}^t, \qquad \tilde{C} = 4 C^{\frac{\eta}{2+\eta}} D^{\frac{2}{2+\eta}}, \quad \tilde{\delta} = \delta^{\frac{\eta}{2+\eta}}.
\end{aligned}
$$
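
As a numerical illustration (not part of the original argument), the Hölder step above can be checked directly: Hölder's inequality also holds for the empirical measure of any finite sample, so the sketch below, with a standard normal variable standing in for $\Delta_t$ and an artificial event standing in for $\{\tau > t\}$, satisfies the bound deterministically. All sampling choices here are made up for illustration.

```python
import random

# Monte Carlo illustration (not from the paper) of the Hoelder step:
#   E[Z^2 1(A)] <= E[|Z|^(2+eta)]^(2/(2+eta)) * P(A)^(eta/(2+eta)).
# Because Hoelder's inequality holds for the empirical measure of any
# finite sample, the assertion below is deterministic, not probabilistic.
random.seed(0)
eta = 1.0                                            # any eta > 0 works
n = 100_000
zs = [random.gauss(0.0, 1.0) for _ in range(n)]      # stand-in for Delta_t
events = [random.random() < 0.3 for _ in range(n)]   # stand-in for 1(tau > t)

lhs = sum(z * z * e for z, e in zip(zs, events)) / n
rhs = (sum(abs(z) ** (2 + eta) for z in zs) / n) ** (2 / (2 + eta)) \
      * (sum(events) / n) ** (eta / (2 + eta))
assert lhs <= rhs
```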

It is then stated in the proof of Proposition 1 in the paper that $\mathbb{E}[(H_0^{n'}(X,Y) - H_0^{n}(X,Y))^2] \le \bar{C}\tilde{\delta}^n$ for some $\bar{C} < \infty$ and all $n, n'$ with $n' > n \ge 0$; we reproduce the implied argument to explicitly represent $\bar{C}$ in terms of $D$ and $\eta$ next:

$$
\begin{aligned}
\mathbb{E}\left[\left(H_0^{n'}(X,Y)-H_0^{n}(X,Y)\right)^2\right] &= \sum_{s=n+1}^{n'}\sum_{t=n+1}^{n'}\mathbb{E}[\Delta_s\Delta_t] \\
&\le \sum_{s=n+1}^{n'}\sum_{t=n+1}^{n'}\mathbb{E}[\Delta_s^2]^{1/2}\,\mathbb{E}[\Delta_t^2]^{1/2} && \text{(Cauchy–Schwarz inequality)} \\
&\le \sum_{s=n+1}^{n'}\sum_{t=n+1}^{n'}\left(\tilde{C}\tilde{\delta}^s\right)^{1/2}\left(\tilde{C}\tilde{\delta}^t\right)^{1/2} \\
&= \tilde{C}\sum_{s=n+1}^{n'}\left(\tilde{\delta}^{1/2}\right)^{s+n+1}\sum_{t=0}^{n'-n-1}\left(\tilde{\delta}^{1/2}\right)^{t} \\
&= \tilde{C}\sum_{s=n+1}^{n'}\left(\tilde{\delta}^{1/2}\right)^{s+n+1}\left(\frac{1-(\tilde{\delta}^{1/2})^{n'-n}}{1-\tilde{\delta}^{1/2}}\right) \\
&\le \tilde{C}\,\frac{1}{1-\tilde{\delta}^{1/2}}\sum_{s=n+1}^{n'}\left(\tilde{\delta}^{1/2}\right)^{s+n+1} \\
&= \tilde{C}\,\frac{1}{1-\tilde{\delta}^{1/2}}\left(\tilde{\delta}^{1/2}\right)^{2n+2}\sum_{s=0}^{n'-n-1}\left(\tilde{\delta}^{1/2}\right)^{s} \\
&= \tilde{C}\,\frac{1}{1-\tilde{\delta}^{1/2}}\left(\tilde{\delta}^{1/2}\right)^{2n+2}\left(\frac{1-(\tilde{\delta}^{1/2})^{n'-n}}{1-\tilde{\delta}^{1/2}}\right) \le \tilde{C}\,\frac{\tilde{\delta}}{(1-\tilde{\delta}^{1/2})^2}\,\tilde{\delta}^{n}
\end{aligned}
$$
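
The geometric-series manipulations above can be sanity-checked numerically. The sketch below, with made-up values for $\tilde{C}$ and $\tilde{\delta}$, verifies over a range of $n < n'$ that the truncated double sum is dominated by the closed-form bound $\tilde{C}\,\tilde{\delta}\,(1-\tilde{\delta}^{1/2})^{-2}\,\tilde{\delta}^{n}$:

```python
# Illustrative check (constants are made up) that the truncated double
# geometric sum arising from Cauchy-Schwarz is dominated by the closed-form
# bound C_tilde * delta_tilde / (1 - delta_tilde^(1/2))^2 * delta_tilde^n.

def double_sum(C_tilde, delta_tilde, n, n_prime):
    """Sum over s, t in {n+1, ..., n'} of (C_tilde d^s)^(1/2) (C_tilde d^t)^(1/2)."""
    r = delta_tilde ** 0.5
    return C_tilde * sum(r ** (s + t)
                         for s in range(n + 1, n_prime + 1)
                         for t in range(n + 1, n_prime + 1))

def closed_bound(C_tilde, delta_tilde, n):
    """The closed-form dominating bound from the last line of the display."""
    r = delta_tilde ** 0.5
    return C_tilde * delta_tilde / (1.0 - r) ** 2 * delta_tilde ** n

C_tilde, delta_tilde = 1.7, 0.8   # hypothetical admissible values
for n in range(5):
    for n_prime in range(n + 1, 30):
        assert double_sum(C_tilde, delta_tilde, n, n_prime) \
               <= closed_bound(C_tilde, delta_tilde, n) * (1 + 1e-12)
```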

so we may take

$$
\bar{C} = \frac{\tilde{\delta}}{(1-\tilde{\delta}^{1/2})^2}\,\tilde{C} = \frac{\tilde{\delta}}{(1-\tilde{\delta}^{1/2})^2}\times 4C^{\frac{\eta}{2+\eta}}D^{\frac{2}{2+\eta}} = \gamma^2 D^{\frac{2}{2+\eta}}, \qquad \gamma^2 := \frac{4C^{\frac{\eta}{2+\eta}}\delta^{\frac{\eta}{2+\eta}}}{\left(1-\delta^{\frac{\eta}{4+2\eta}}\right)^2} \tag{2}
$$
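
The algebra defining $\gamma^2$ can be verified mechanically. The sketch below, over a couple of arbitrary admissible parameter sets, checks that the two expressions for $\bar{C}$ agree:

```python
import math

# Mechanical check (arbitrary admissible constants) that the two expressions
# for C_bar in (2) coincide:
#   delta_tilde / (1 - delta_tilde^(1/2))^2 * C_tilde  ==  gamma^2 * D^(2/(2+eta)).

def C_bar_direct(C, D, delta, eta):
    delta_tilde = delta ** (eta / (2 + eta))
    C_tilde = 4 * C ** (eta / (2 + eta)) * D ** (2 / (2 + eta))
    return delta_tilde / (1 - delta_tilde ** 0.5) ** 2 * C_tilde

def C_bar_via_gamma(C, D, delta, eta):
    gamma_sq = 4 * C ** (eta / (2 + eta)) * delta ** (eta / (2 + eta)) \
               / (1 - delta ** (eta / (4 + 2 * eta))) ** 2
    return gamma_sq * D ** (2 / (2 + eta))

for (C, D, delta, eta) in [(1.0, 1.0, 0.5, 1.0), (3.0, 2.5, 0.9, 0.5)]:
    assert math.isclose(C_bar_direct(C, D, delta, eta),
                        C_bar_via_gamma(C, D, delta, eta), rel_tol=1e-9)
```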

where $\gamma$ is an $h$-independent constant that depends only on the law of the meeting time $\tau$ of the coupled Markov chains. The constant $\gamma$ is finite since $\delta \in (0,1)$.

The stylised bound that we present is rooted in the concept of the maximum mean discrepancy associated to a reproducing kernel Hilbert space $\mathcal{H}$, defined as

$$
d_{\mathcal{H}}(\pi,\pi') := \sup_{\|f\|_{\mathcal{H}} \le 1}\left|\pi(f)-\pi'(f)\right|.
$$
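
For intuition, the maximum mean discrepancy admits the well-known representation $d_{\mathcal{H}}(\pi,\pi')^2 = \mathbb{E}[k(x,x')] + \mathbb{E}[k(y,y')] - 2\,\mathbb{E}[k(x,y)]$ in terms of the reproducing kernel $k$, which can be estimated from samples. A minimal sketch, using a Gaussian kernel and made-up sampling distributions (neither is specific to this discussion):

```python
import math
import random

# Empirical sketch of the maximum mean discrepancy: for an RKHS with kernel k,
#   d_H(pi, pi')^2 = E k(x, x') + E k(y, y') - 2 E k(x, y),
# estimated here by the (biased) V-statistic over two samples.

def gaussian_kernel(x, y, ell=1.0):
    # Bounded kernel with sup_x k(x, x) = 1, matching the boundedness
    # assumption used later in the text.
    return math.exp(-(x - y) ** 2 / (2 * ell ** 2))

def mmd_squared(xs, ys, k=gaussian_kernel):
    kxx = sum(k(a, b) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(k(a, b) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(k(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

random.seed(1)
xs = [random.gauss(0, 1) for _ in range(500)]    # samples from pi (hypothetical)
ys = [random.gauss(2, 1) for _ in range(500)]    # samples from pi' (hypothetical)
same = [random.gauss(0, 1) for _ in range(500)]  # a second sample from pi

# Well-separated laws score much higher than two samples from the same law.
assert mmd_squared(xs, ys) > mmd_squared(xs, same)
```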

If $|h|^{2+\eta} \in \mathcal{H}$ then we have from the definition of the maximum mean discrepancy that

$$
\left|\pi(|h|^{2+\eta}) - \pi'(|h|^{2+\eta})\right| \le \left\||h|^{2+\eta}\right\|_{\mathcal{H}} d_{\mathcal{H}}(\pi,\pi').
$$

Taking $\pi'$ to be the law $\pi_t$ of $X_t$ thus gives that

$$
\begin{aligned}
\left|\pi(|h|^{2+\eta}) - \mathbb{E}\left[|h(X_t)|^{2+\eta}\right]\right| &\le \left\||h|^{2+\eta}\right\|_{\mathcal{H}} d_{\mathcal{H}}(\pi,\pi_t) \\
\implies \mathbb{E}\left[|h(X_t)|^{2+\eta}\right] &\le \pi(|h|^{2+\eta}) + \left\||h|^{2+\eta}\right\|_{\mathcal{H}} d_{\mathcal{H}}(\pi,\pi_t) \\
\implies \sup_{t\ge 0}\mathbb{E}\left[|h(X_t)|^{2+\eta}\right] &\le \pi(|h|^{2+\eta}) + \left\||h|^{2+\eta}\right\|_{\mathcal{H}} \sup_{t\ge 0} d_{\mathcal{H}}(\pi,\pi_t).
\end{aligned}
$$

Thus we may take the constant $D$ in Assumption 1 to be

$$
D = \pi(|h|^{2+\eta}) + \left\||h|^{2+\eta}\right\|_{\mathcal{H}} \sup_{t\ge 0} d_{\mathcal{H}}(\pi,\pi_t). \tag{3}
$$

In what follows we let $\lambda := \sup_{t\ge 0} d_{\mathcal{H}}(\pi,\pi_t)$ be an $h$-independent constant that depends on the law of the Markov chain used. It is necessary to check that $\lambda$ is finite. Let $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ be the inner product in $\mathcal{H}$. The assumption that $\mathcal{H}$ is a reproducing kernel Hilbert space with kernel $k$ means that $|h(x)| = |\langle h, k(\cdot,x)\rangle_{\mathcal{H}}| \le \|h\|_{\mathcal{H}}\,k(x,x)^{1/2}$, from the reproducing property and the Cauchy–Schwarz inequality. Since the kernel was assumed to satisfy $\sup_x k(x,x) \le 1$, it follows that $\|h\|_{\infty} \le \|h\|_{\mathcal{H}}$. Thus

$$
d_{\mathcal{H}}(\pi,\pi') = \sup_{\|h\|_{\mathcal{H}} \le 1}\left|\pi(h)-\pi'(h)\right| \le \sup_{\|h\|_{\infty}\le 1}\left|\pi(h)-\pi'(h)\right| = d_{\mathrm{TV}}(\pi,\pi').
$$

Thus $\lambda \le \sup_{t\ge 0} d_{\mathrm{TV}}(\pi,\pi_t) < \infty$ as required.
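
The inequality $\|h\|_{\infty} \le \|h\|_{\mathcal{H}}$ used above can be illustrated numerically for a function in the span of a bounded kernel, where the RKHS norm has the closed form $\|h\|_{\mathcal{H}}^2 = a^{\top} K a$. The kernel centres and coefficients below are arbitrary, and the supremum norm is approximated on a grid:

```python
import math
import random

# Sketch (illustrative): for h = sum_i a_i k(., x_i) with a bounded kernel
# (sup_x k(x, x) <= 1), the reproducing property plus Cauchy-Schwarz gives
# ||h||_inf <= ||h||_H.  Here ||h||_H^2 = a^T K a in closed form.

def k(x, y):
    return math.exp(-(x - y) ** 2 / 2)  # Gaussian kernel: k(x, x) = 1 for all x

random.seed(2)
centres = [random.uniform(-3, 3) for _ in range(5)]   # arbitrary kernel centres
coeffs = [random.uniform(-1, 1) for _ in range(5)]    # arbitrary coefficients

def h(x):
    return sum(a * k(x, c) for a, c in zip(coeffs, centres))

# RKHS norm: ||h||_H^2 = sum_{i,j} a_i a_j k(x_i, x_j)
norm_H = math.sqrt(sum(a * b * k(c, d)
                       for a, c in zip(coeffs, centres)
                       for b, d in zip(coeffs, centres)))
# Grid approximation of the supremum norm (a lower bound on the true sup)
sup_norm = max(abs(h(x / 100.0)) for x in range(-1000, 1001))
assert sup_norm <= norm_H + 1e-9
```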

To complete the argument we proceed as follows:

$$
\begin{aligned}
\mathbb{E}\left[\left(H_0^{n'}(X,Y)-H_0^{n}(X,Y)\right)^2\right] &\le \bar{C}\tilde{\delta}^n \\
\implies \left|\mathbb{E}\left[\left(H_0^{n'}(X,Y)\right)^2\right]^{1/2} - \mathbb{E}\left[\left(H_0^{n}(X,Y)\right)^2\right]^{1/2}\right| &\le \left(\bar{C}\tilde{\delta}^n\right)^{1/2} && \text{(reverse Minkowski inequality)} \\
\implies \mathbb{E}\left[H_0(X,Y)^2\right]^{1/2} &\le \bar{C}^{1/2} + \mathbb{E}\left[h(X_0)^2\right]^{1/2} && \text{(taking } n=0,\ n'=\infty\text{)} \\
\implies \sigma(h) &\le \bar{C}^{1/2} + \mathbb{E}\left[h(X_0)^2\right]^{1/2} && \text{(since } \mathbb{V}[Z]\le\mathbb{E}[Z^2]\text{)} \\
&\le \gamma D^{\frac{1}{2+\eta}} + \mathbb{E}\left[h(X_0)^2\right]^{1/2} && \text{(from (2))} \\
&\le \gamma\left(\pi(|h|^{2+\eta}) + \lambda\left\||h|^{2+\eta}\right\|_{\mathcal{H}}\right)^{\frac{1}{2+\eta}} + \mathbb{E}\left[h(X_0)^2\right]^{1/2}
\end{aligned}
$$

where the final line follows from (3) and the fact that $x \mapsto x^{\frac{1}{2+\eta}}$ is non-decreasing.
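
To illustrate how the stylised bound might be evaluated in practice, the sketch below assembles $\gamma$ and the final expression from plug-in constants; every numerical input here is hypothetical, standing in for quantities ($C$, $\delta$, $\eta$, $\pi(|h|^{2+\eta})$, $\lambda$, $\||h|^{2+\eta}\|_{\mathcal{H}}$, $\mathbb{E}[h(X_0)^2]$) that would have to be bounded for the chain and function at hand:

```python
# Assemble the stylised bound
#   sigma(h) <= gamma * (pi(|h|^(2+eta)) + lambda * |||h|^(2+eta)||_H)^(1/(2+eta))
#               + E[h(X_0)^2]^(1/2)
# from plug-in constants (all inputs below are made up for illustration).

def sigma_bound(C, delta, eta, pi_moment, lam, rkhs_norm, second_moment_X0):
    # gamma^2 as defined in (2)
    gamma = (4 * C ** (eta / (2 + eta)) * delta ** (eta / (2 + eta))
             / (1 - delta ** (eta / (4 + 2 * eta))) ** 2) ** 0.5
    return gamma * (pi_moment + lam * rkhs_norm) ** (1 / (2 + eta)) \
           + second_moment_X0 ** 0.5

# Illustrative evaluation with hypothetical inputs:
b = sigma_bound(C=2.0, delta=0.9, eta=1.0, pi_moment=1.5, lam=0.5,
                rkhs_norm=2.0, second_moment_X0=1.0)
assert b > 0
```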