Simple Confidence Intervals for MCMC Without CLTs

12/01/2018 ∙ by Jeffrey S. Rosenthal, et al. ∙ 0

This short note argues that 95% confidence intervals for MCMC estimates can be obtained even without establishing a CLT, by multiplying their widths by 2.3.

1 Introduction

Markov chain Monte Carlo (MCMC) algorithms are very widely used to estimate expected values in a variety of settings, especially for Bayesian inference (see e.g. Brooks et al., 2011, and the many references therein).

It has been pointed out by various authors (e.g. Jones and Hobert, 2001; Flegal et al., 2008) that in addition to providing an estimate, it is also important to quantify the error in the estimate, hopefully by providing confidence intervals for the value being estimated.

Such error estimates and confidence intervals are usually obtained via Markov chain Central Limit Theorems (CLTs); see e.g. Tierney (1994, Theorem 4), Chan and Geyer (1994), Jones (2004), Roberts and Rosenthal (2004), and Jones et al. (2006). Indeed, CLTs are often considered essential for this purpose, e.g. Jones (2007, p. 131) writes “The CLT is the basis of all error estimation in Monte Carlo”. However, establishing CLTs for MCMC requires the verification of challenging properties like geometric ergodicity, which is often difficult in applied problems. This makes confidence intervals harder to obtain in MCMC applications.

In this short note, we show (Theorem 1) that for typical MCMC applications, as long as the asymptotic variance can be estimated, a confidence interval (or at least an upper bound on a confidence interval) can be obtained quite simply, via Chebychev’s inequality, without requiring any sort of CLT or distributional convergence at all.

2 Assumptions

Let $\{X_n\}$ be a Markov chain on a state space $\mathcal{X}$ which converges to a target distribution $\pi$. Let $h : \mathcal{X} \to \mathbb{R}$ be some functional, and assume we wish to estimate the stationary expected value of $h$, i.e. $\pi(h) := \int h(x)\,\pi(dx)$, by the usual MCMC estimate, $e_n := \frac{1}{n} \sum_{i=1}^{n} h(X_i)$.

In typical MCMC applications, the estimate $e_n$ will have variance $O(1/n)$ and bias $O(1/n)$ (see e.g. page 21 of Geyer, 2011). Consistent with this, we assume:

(A1) (Order $1/n$ variance.) The limit $V := \lim_{n\to\infty} n \, \mathrm{Var}(e_n)$ exists and is in $(0,\infty)$.

(A2) (Smaller-order bias.) $\lim_{n\to\infty} \sqrt{n} \, |E(e_n) - \pi(h)| = 0$.

We also require an estimator of the asymptotic variance value $V$. Such estimators are quite common, and can be obtained in many different ways, including repeated runs, integrated autocorrelation times, batch means, window estimators, regenerations, and more; see e.g. Section 3 of Geyer (1992), Hobert et al. (2002), Jones et al. (2006), Häggström and Rosenthal (2007), etc. We thus assume:

(A3) (Variance estimator.) There is an estimator $\hat\sigma_n^2$ with $\lim_{n\to\infty} \hat\sigma_n^2 = V$ in probability.
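As an illustration of the kind of variance estimator meant here, the following is a minimal sketch of a batch-means estimate of the limit of n times the variance of the usual MCMC estimate. The function names, the default of roughly sqrt(n) batches, and the choice to discard a leftover tail are assumptions made for this example, not prescriptions from the note; whether such an estimator is consistent in a given application has to be checked separately.

```python
import math


def mcmc_estimate(values):
    """Usual MCMC estimate: the sample mean of h(X_1), ..., h(X_n)."""
    return sum(values) / len(values)


def batch_means_variance(values, num_batches=None):
    """Batch-means estimate of the asymptotic variance, lim n * Var(e_n).

    Splits the chain into equal-length consecutive batches and returns
    batch_length * (sample variance of the batch means).
    """
    n = len(values)
    if num_batches is None:
        # Assumed default for this sketch: about sqrt(n) batches.
        num_batches = max(2, int(math.sqrt(n)))
    batch_length = n // num_batches
    if num_batches < 2 or batch_length < 1:
        raise ValueError("need at least 2 batches, each with at least 1 value")
    used = values[: num_batches * batch_length]  # drop any leftover tail
    means = [
        sum(used[i * batch_length:(i + 1) * batch_length]) / batch_length
        for i in range(num_batches)
    ]
    grand_mean = sum(means) / num_batches
    sample_var = sum((m - grand_mean) ** 2 for m in means) / (num_batches - 1)
    return batch_length * sample_var
```

For i.i.d. draws the asymptotic variance is just the ordinary variance of the functional, which gives a simple sanity check for the estimator.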

3 Main Result

Under the above mild assumptions, our result is as follows:

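Theorem 1. Assume (A1)–(A3) above, fix $0 < \alpha < 1$ and $\epsilon > 0$, and define the interval

$$I_n := \left( e_n - n^{-1/2} \, \hat\sigma_n \, (\alpha^{-1/2} + \epsilon), \;\; e_n + n^{-1/2} \, \hat\sigma_n \, (\alpha^{-1/2} + \epsilon) \right).$$

Then

$$\liminf_{n\to\infty} P\big( \pi(h) \in I_n \big) \ge 1 - \alpha,$$

i.e. the interval $I_n$ includes the true expected value $\pi(h)$ with asymptotic probability at least $1-\alpha$, i.e. $I_n$ has asymptotic coverage probability at least $1-\alpha$.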

Theorem 1 may be interpreted as saying that the interval $I_n$ contains an asymptotic $(1-\alpha)$-confidence interval for $\pi(h)$, i.e. it is an overly conservative confidence interval. Since the main purpose of MCMC confidence intervals is to provide approximate guarantees for estimates, this conservativeness is not a major limitation.
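To see why a Chebychev-based interval is conservative, and where a multiplier of order $\alpha^{-1/2}$ comes from, recall Chebychev's inequality for a generic random variable $Y$ with mean $\mu$ and finite variance $s^2 > 0$. The symbols $Y$, $\mu$, $s$ and $k$ are introduced here purely for illustration.

$$P\big( |Y - \mu| \ge k s \big) \le \frac{1}{k^2} \quad (k > 0), \qquad \text{so} \qquad k = \alpha^{-1/2} \;\Longrightarrow\; P\big( |Y - \mu| \ge \alpha^{-1/2} s \big) \le \alpha .$$

For $\alpha = 0.05$ this gives $k = \sqrt{20} \approx 4.472$, compared with the Gaussian quantile $1.96$. The bound holds for every distribution with finite variance, which is why no distributional convergence is needed and why the resulting interval is wider than a CLT-based one.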

Most commonly, the significance level is $\alpha = 0.05$. In that case, the usual CLT-derived 95% asymptotic confidence interval for $\pi(h)$ would be given by $(e_n - 1.96 \, \hat\sigma_n / \sqrt{n}, \; e_n + 1.96 \, \hat\sigma_n / \sqrt{n})$. By contrast, taking $\alpha = 0.05$ and a small $\epsilon > 0$ (so that $\alpha^{-1/2} + \epsilon \approx 4.472 + \epsilon$ is at most $4.48$), our interval is computed to be $(e_n - 4.48 \, \hat\sigma_n / \sqrt{n}, \; e_n + 4.48 \, \hat\sigma_n / \sqrt{n})$. So, Theorem 1 can be interpreted as saying that even without establishing a Markov chain CLT, the usual MCMC asymptotic 95% confidence interval still applies, except with “1.96” replaced by “4.48”, i.e. multiplying by just under 2.3 (and with the asymptotic coverage probability being at least 95% instead of exactly 95%, i.e. being overly conservative). Given the difficulty of establishing CLTs for MCMC algorithms, it seems easier to instead simply multiply the confidence interval width by 2.3.
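The comparison between the two multipliers can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the half-width is assumed to have the form (alpha^(-1/2) + eps) * sigma_hat / sqrt(n), and the default eps = 0.008 is an assumption chosen so that the multiplier comes out close to 4.48.

```python
import math


def conservative_interval(e_n, sigma_hat, n, alpha=0.05, eps=0.008):
    """Chebychev-style interval e_n +/- (alpha**-0.5 + eps) * sigma_hat / sqrt(n).

    eps = 0.008 is an assumed default, picked so the multiplier is about 4.48.
    """
    if not (0 < alpha < 1) or eps <= 0 or n <= 0 or sigma_hat < 0:
        raise ValueError("need 0 < alpha < 1, eps > 0, n > 0, sigma_hat >= 0")
    half_width = (alpha ** -0.5 + eps) * sigma_hat / math.sqrt(n)
    return e_n - half_width, e_n + half_width


def clt_interval(e_n, sigma_hat, n, z=1.96):
    """Usual CLT-based interval e_n +/- z * sigma_hat / sqrt(n)."""
    half_width = z * sigma_hat / math.sqrt(n)
    return e_n - half_width, e_n + half_width


if __name__ == "__main__":
    # Hypothetical inputs: estimate 0.0, sigma_hat 1.0, chain length 100.
    lo_c, hi_c = conservative_interval(0.0, 1.0, 100)
    lo_z, hi_z = clt_interval(0.0, 1.0, 100)
    print("conservative:", (lo_c, hi_c))
    print("CLT-based:   ", (lo_z, hi_z))
    print("width ratio: ", (hi_c - lo_c) / (hi_z - lo_z))
```

Both intervals use the same estimate and the same variance estimator; only the multiplier changes, so the width ratio does not depend on the data.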

4 Proof of Theorem 1

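For any $a > 0$, we have by the triangle inequality that

$$P\big( |e_n - \pi(h)| \ge a \big) \le P\big( |e_n - E(e_n)| + |E(e_n) - \pi(h)| \ge a \big) = P\big( |e_n - E(e_n)| \ge a - |E(e_n) - \pi(h)| \big).$$

Hence, if

$$|E(e_n) - \pi(h)| < a,$$

then by Chebychev’s inequality (e.g. Rosenthal, 2006, Proposition 5.1.2),

$$P\big( |e_n - \pi(h)| \ge a \big) \le \frac{\mathrm{Var}(e_n)}{\big( a - |E(e_n) - \pi(h)| \big)^2}.$$

We now set $a = n^{-1/2} \sqrt{V} \, \alpha^{-1/2}$. Then by (A2), $|E(e_n) - \pi(h)| / a \to 0$. Hence, the condition $|E(e_n) - \pi(h)| < a$ is satisfied for all sufficiently large $n$, and as $n \to \infty$, we have from the above and (A1) that

$$\limsup_{n\to\infty} P\big( |e_n - \pi(h)| \ge n^{-1/2} \sqrt{V} \, \alpha^{-1/2} \big) \le \lim_{n\to\infty} \frac{n \, \mathrm{Var}(e_n)}{V \alpha^{-1} \big( 1 - |E(e_n) - \pi(h)| / a \big)^2} = \alpha.$$

It remains to replace the true variance coefficient $V$ by its estimator $\hat\sigma_n^2$. For this, let $r := \alpha^{-1/2} / (\alpha^{-1/2} + \epsilon) \in (0,1)$. Then by (A3), $P(\hat\sigma_n < r \sqrt{V}) \to 0$. Therefore, since $\hat\sigma_n \ge r\sqrt{V}$ implies $\hat\sigma_n (\alpha^{-1/2} + \epsilon) \ge \sqrt{V} \, \alpha^{-1/2}$,

$$\limsup_{n\to\infty} P\big( |e_n - \pi(h)| \ge n^{-1/2} \hat\sigma_n (\alpha^{-1/2} + \epsilon) \big) \le \limsup_{n\to\infty} P\big( |e_n - \pi(h)| \ge n^{-1/2} \sqrt{V} \, \alpha^{-1/2} \big) + \limsup_{n\to\infty} P\big( \hat\sigma_n < r \sqrt{V} \big) \le \alpha + 0 = \alpha.$$

Taking complements, we obtain that

$$\liminf_{n\to\infty} P\big( |e_n - \pi(h)| < n^{-1/2} \hat\sigma_n (\alpha^{-1/2} + \epsilon) \big) \ge 1 - \alpha.$$

Finally, note that $|e_n - \pi(h)| < n^{-1/2} \hat\sigma_n (\alpha^{-1/2} + \epsilon)$ if and only if $\pi(h) \in I_n$. Hence, this completes the proof of Theorem 1.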
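The coverage statement can be checked empirically on a toy example. The sketch below is an assumption-laden illustration rather than part of the argument: it uses a Gaussian AR(1) chain (which does in fact satisfy a CLT, so it only illustrates the conservativeness), a batch-means variance estimate with a fixed number of batches, and hypothetical function names and default parameters.

```python
import math
import random


def ar1_chain(n, rho, rng):
    """AR(1) chain X_i = rho * X_{i-1} + N(0,1) noise, started at 0.

    Its stationary mean is 0, so the target expected value is 0 for h(x) = x.
    """
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def batch_means_sd(values, num_batches):
    """Square root of the batch-means estimate of lim n * Var(e_n)."""
    b = len(values) // num_batches
    means = [sum(values[i * b:(i + 1) * b]) / b for i in range(num_batches)]
    g = sum(means) / num_batches
    var_of_means = sum((m - g) ** 2 for m in means) / (num_batches - 1)
    return math.sqrt(b * var_of_means)


def coverage(multiplier, n=2000, rho=0.5, reps=200, seed=1):
    """Fraction of runs whose interval e_n +/- multiplier * sigma_hat / sqrt(n)
    contains the true stationary mean 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = ar1_chain(n, rho, rng)
        e_n = sum(xs) / n
        half = multiplier * batch_means_sd(xs, num_batches=40) / math.sqrt(n)
        if abs(e_n) < half:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("coverage with 1.96:", coverage(1.96))
    print("coverage with 4.48:", coverage(4.48))
```

Because both calls use the same seed, they see the same simulated chains, so the wider interval can never have lower empirical coverage than the narrower one.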

The recent paper Atchadé (2016) also obtains confidence intervals for MCMC without requiring CLTs. However, its results apply only to reversible chains, require knowledge of the spectrum of a complicated kernel, and proceed by establishing convergence in distribution to a complicated generalised T-distribution which appears difficult to work with, so they cannot be described as “simple”.

I thank Jim Hobert for very useful comments, and thank the anonymous referees for helpful reports.

References

Y.F. Atchadé (2016), Markov Chain Monte Carlo confidence intervals. Bernoulli 22(3), 1808–1838.

S. Brooks, A. Gelman, G.L. Jones, and X.-L. Meng, eds. (2011), Handbook of Markov chain Monte Carlo. Chapman & Hall / CRC Press.

K.S. Chan and C.J. Geyer (1994), Discussion of Tierney (1994). Ann. Stat. 22, 1747–1758.

J.M. Flegal, M. Haran, and G.L. Jones (2008), Markov Chain Monte Carlo: Can We Trust the Third Significant Figure? Stat. Sci. 23(2), 250–260.

C.J. Geyer (1992), Practical Markov chain Monte Carlo. Stat. Sci. 7, 473–483.

C.J. Geyer (2011), Introduction to Markov chain Monte Carlo. Chapter 1 of Brooks et al. (2011).

O. Häggström and J.S. Rosenthal (2007), On Variance Conditions for Markov Chain CLTs. Elec. Comm. Prob. 12, 454–464.

J.P. Hobert, G.L. Jones, B. Presnell, and J.S. Rosenthal (2002), On the Applicability of Regenerative Simulation in Markov Chain Monte Carlo. Biometrika 89, 731–743.

G.L. Jones (2004), On the Markov chain central limit theorem. Prob. Surv. 1, 299–320.

G.L. Jones, M. Haran, B.S. Caffo, and R. Neath (2006), Fixed Width Output Analysis for Markov chain Monte Carlo. J. Amer. Stat. Assoc. 101, 1537–1547.

G.L. Jones (2007), Course notes for STAT 8701: Computational Statistical Methods.

G.L. Jones and J.P. Hobert (2001), Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Stat. Sci. 16, 312–334.

G.O. Roberts and J.S. Rosenthal (2004), General state space Markov chains and MCMC algorithms. Prob. Surv. 1, 20–71.

J.S. Rosenthal (2006), A first look at rigorous probability theory, 2nd ed. World Scientific Publishing Company, Singapore.

L. Tierney (1994), Markov chains for exploring posterior distributions (with discussion). Ann. Stat. 22, 1701–1762.