## References

- [1] Assaraf, R. and Caffarel, M., 1999. Zero-variance principle for Monte Carlo algorithms. Physical Review Letters, 83(23), p.4682.
- [2] Barp, A., Oates, C., Porcu, E. and Girolami, M., 2018. A Riemannian-Stein Kernel Method. arXiv preprint arXiv:1810.04946.
- [3] Belomestny, D., Iosipoi, L., Moulines, E., Naumov, A. and Samsonov, S., 2019. Variance reduction for Markov chains with application to MCMC. arXiv preprint arXiv:1910.03643.
- [4] Heng, J. and Jacob, P. E., 2019. Unbiased Hamiltonian Monte Carlo with couplings. Biometrika, 106(2), pp.287-302.
- [5] Jacob, P.E., O’Leary, J. and Atchadé, Y.F., 2020. Unbiased Markov chain Monte Carlo with couplings (with discussion and rejoinder). To appear in the Journal of the Royal Statistical Society: Series B (Statistical Methodology).
- [6] Mijatović, A. and Vogrinc, J., 2018. On the Poisson equation for Metropolis–Hastings chains. Bernoulli, 24(3), pp.2401-2428.
- [7] Mira, A., Solgi, R. and Imparato, D., 2013. Zero variance Markov chain Monte Carlo for Bayesian estimators. Statistics and Computing, 23(5), pp.653-662.
- [8] Oates, C.J., Girolami, M. and Chopin, N., 2017. Control functionals for Monte Carlo integration. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(3), pp.695-718.
- [9] South, L.F., Oates, C.J., Mira, A. and Drovandi, C., 2018. Regularised zero-variance control variates for high-dimensional variance reduction. arXiv preprint arXiv:1811.05073.

## Appendix A: Derivation of the Upper Bound

The aim in what follows is to reproduce the proof of Proposition 1 in [5] whilst explicitly tracking the terms that are -dependent. To avoid reproducing large amounts of [5], we assume familiarity with the notation and quantities defined in that work.

The first part of the argument in [5] uses Assumption 1 to deduce that for some and all . Our first task is to explicitly compute the constant in terms of the quantities and in Assumption 1. To this end, we reproduce the argument alluded to in the paper:

It is then stated in the proof of Proposition 1 in [5] that where for some and all with ; we reproduce the implied argument to explicitly represent in terms of and next:

so we may take

(2)

where is a -independent constant that depends only on the law of the meeting time for the Markov chains. The constant is finite since .
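To see why a constant of this form is finite, recall that in [5] the meeting time of the coupled chains has geometric tails. A sketch of the summability argument, assuming a tail bound $\mathbb{P}(\tau > n) \le C_0 \delta^n$ with $C_0 < \infty$ and $\delta \in (0,1)$ (the symbols $\tau$, $C_0$, $\delta$ here stand in for the corresponding quantities in [5]):

```latex
\sum_{n \ge 0} n^{a}\, \mathbb{P}(\tau > n)^{b}
\;\le\; C_0^{\,b} \sum_{n \ge 0} n^{a}\, (\delta^{b})^{n}
\;<\; \infty
\qquad \text{for any } a \ge 0,\ b > 0,
```

since $\delta^{b} < 1$, so the series is dominated by a convergent polynomial-times-geometric series.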

The stylised bound that we present is rooted in the concept of the *maximum mean discrepancy* associated to the reproducing kernel Hilbert space , defined as
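For completeness, the standard definition can be written as follows; the symbols $\mathcal{H}$, $k$, $P$, $Q$ below are placeholders for the corresponding quantities in the text:

```latex
\mathrm{MMD}(P, Q)
\;=\; \sup_{\|f\|_{\mathcal{H}} \le 1}
\left| \int f \,\mathrm{d}P - \int f \,\mathrm{d}Q \right|
\;=\; \bigl\| \mu_P - \mu_Q \bigr\|_{\mathcal{H}},
\qquad
\mu_P := \int k(\cdot, x) \,\mathrm{d}P(x),
```

where $\mu_P$ is the kernel mean embedding of $P$. In particular, for any $f$ with $\|f\|_{\mathcal{H}} \le 1$, the difference of expectations under $P$ and $Q$ is bounded by $\mathrm{MMD}(P, Q)$.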

If then we have from the definition of the maximum mean discrepancy that

Taking to be the law of thus gives that

Thus we may take the constant in Assumption 1 to be

(3)

In what follows we let be a -independent constant that depends on the law of the Markov chain used. It is necessary to check that is finite. Let be the inner product in . Since is a reproducing kernel Hilbert space, the reproducing property together with the Cauchy–Schwarz inequality gives . Since the kernel was assumed to satisfy , it follows that . Thus
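The reproducing-property step invoked here is standard; spelled out with placeholder symbols ($f$, $k$, $K$ stand in for the quantities in the text):

```latex
|f(x)|
\;=\; \bigl| \langle f, k(\cdot, x) \rangle_{\mathcal{H}} \bigr|
\;\le\; \|f\|_{\mathcal{H}} \, \|k(\cdot, x)\|_{\mathcal{H}}
\;=\; \|f\|_{\mathcal{H}} \, \sqrt{k(x, x)},
```

so a uniform bound $\sup_x k(x,x) \le K < \infty$ on the kernel yields $\|f\|_{\infty} \le \sqrt{K}\, \|f\|_{\mathcal{H}}$.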

Thus as required.

To complete the argument we proceed as follows:

where the final line follows from (3) and the fact that .
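As a concrete numerical aside, the maximum mean discrepancy appearing in the bound admits a simple plug-in estimator. The sketch below is ours, not from [5]: the Gaussian kernel choice and the helper names `gaussian_kernel` and `mmd_squared` are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))
    evaluated pairwise between rows of x (n, d) and y (m, d)."""
    diff = x[:, None, :] - y[None, :, :]              # shape (n, m, d)
    sq_dists = np.sum(diff ** 2, axis=-1)             # shape (n, m)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def mmd_squared(xs, ys, lengthscale=1.0):
    """V-statistic estimate of MMD^2 between samples xs ~ P and ys ~ Q:
    the squared RKHS norm of the difference of empirical mean embeddings."""
    k_xx = gaussian_kernel(xs, xs, lengthscale)
    k_yy = gaussian_kernel(ys, ys, lengthscale)
    k_xy = gaussian_kernel(xs, ys, lengthscale)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

rng = np.random.default_rng(0)
# Two samples from the same distribution: MMD^2 estimate should be near zero.
same = mmd_squared(rng.normal(0.0, 1.0, (500, 1)), rng.normal(0.0, 1.0, (500, 1)))
# Samples from shifted distributions: MMD^2 estimate should be clearly positive.
shifted = mmd_squared(rng.normal(0.0, 1.0, (500, 1)), rng.normal(2.0, 1.0, (500, 1)))
```

Because the V-statistic equals a squared RKHS norm, the estimate is non-negative by construction; it is small for the matched pair and well separated from zero for the shifted pair.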