A principled stopping rule for importance sampling

Importance sampling (IS) is a Monte Carlo technique that relies on weighted samples, simulated from a proposal distribution, to estimate intractable integrals. The quality of the estimators improves with the number of samples. However, for achieving a desired quality of estimation, the required number of samples is unknown, and depends on the quantity of interest, the estimator, and the chosen proposal. We present a sequential stopping rule that terminates simulation when the overall variability in estimation is relatively small. The proposed methodology closely connects to the idea of an effective sample size in IS and overcomes crucial shortcomings of existing metrics, e.g., it acknowledges multivariate estimation problems. Our stopping rule retains asymptotic guarantees and provides users a clear guideline on when to stop the simulation in IS.



1 Introduction

In a wide variety of applications, a key problem of interest is the estimation of intractable integrals through Monte Carlo techniques. Specifically, let $\pi$ be a target distribution on $\mathsf{X} \subseteq \mathbb{R}^d$ with associated density function also denoted by $\pi$. Suppose $h : \mathsf{X} \to \mathbb{R}^p$ and interest is in estimating

$$\mu = \mathrm{E}_\pi[h(X)] = \int_{\mathsf{X}} h(x)\,\pi(x)\,dx. \qquad (1)$$

Under independent and identically distributed (iid) sampling $X_1, \dots, X_n$ from $\pi$, the vanilla Monte Carlo estimator of $\mu$ is

$$\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} h(X_i). \qquad (2)$$

While iid sampling from the target density is often desirable, it may not always be feasible, either due to the computational burden or the inefficiency of the resulting estimators (see Denny, 2001). In such cases, practitioners may resort to other Monte Carlo methods such as Markov chain Monte Carlo (MCMC) (Robert and Casella, 2013; Ekvall and Jones, 2014) or IS (Kahn, 1950a, b; Elvira and Martino, 2021).

Importance sampling is a popular Monte Carlo technique often used for variance reduction. In IS, samples are simulated from a proxy proposal distribution and weighted averages of desired functions are calculated. Specifically, let $q$ denote the density of a chosen proposal distribution and, for a known or unknown normalizing constant $Z$, let $\tilde{\pi} = Z\pi$ be such that it can be evaluated (often called the unnormalized target) for every $x \in \mathsf{X}$. Random samples $X_1, \dots, X_n$, drawn from the distribution with density $q$, are assigned importance weights

$$w(X_i) = \frac{\tilde{\pi}(X_i)}{q(X_i)}, \qquad i = 1, \dots, n. \qquad (3)$$

Based on $Z$ being known or unknown, weighted averages of $h(X_i)$ yield estimators of $\mu$, which we will generically denote as $\hat{\mu}_n^{\mathrm{IS}}$. Often, $\hat{\mu}_n^{\mathrm{IS}}$ is the unnormalized IS (UIS) estimator or the self-normalized IS (SNIS) estimator. Alternative IS estimators have also been proposed for efficient estimation (see Vehtari et al., 2015; Elvira et al., 2019; Kuntz et al., 2021; Martino et al., 2018).

In addition to the target and proposal distributions, variability in $\hat{\mu}_n^{\mathrm{IS}}$ is critically dependent on: (i) the function of interest $h$ and (ii) the choice of the estimator. Almost all methods assessing the quality of an IS algorithm utilize only the importance weights and do not factor in $h$ or the estimator employed to estimate $\mu$. A key question that thus remains unanswered is: how should $n$ be chosen so that $\hat{\mu}_n^{\mathrm{IS}}$ is a good estimate of $\mu$?

Common diagnostics for IS algorithms measure the discrepancy between the target and proposal distributions. Chatterjee et al. (2018) obtain sample size requirements by utilizing the Kullback-Leibler divergence between $\pi$ and $q$. Sanz-Alonso (2018) and Sanz-Alonso and Wang (2021) leverage the divergence between $\pi$ and $q$ to quantify the variance of the weights and obtain an estimate of the necessary sample size. These methods are useful in understanding the quality of the weights, but since they are invariant to the choice of $h$ and the estimator, they are not equipped to directly explain the quality of estimation of $\mu$.

Another popular diagnostic in IS is the effective sample size (ESS), which is meant to provide the number of iid samples from $\pi$ that would yield the same variability in estimating $\mu$ as the $n$ weighted IS samples. Using a series of simplifying assumptions, Kong (1992) provides a popularly employed estimator of the ESS. As described by Elvira et al. (2018), the simplifying assumptions make it so that the resulting estimator does not satisfy some key desirable properties. As a consequence, although it is a reasonable diagnostic for assessing the suitability of the proposal distribution, the estimator fails to truly assess the quality of estimation of $\mu$. Nevertheless, it continues to be used in IS as a practical diagnostic owing to reasonable statistical properties, ease of implementation, and a lack of better alternatives (see Elvira et al. (2018) for more details).

Due to interest in several expectations, often $h$ and its estimator are multivariate. Naturally, the correlation among components of the estimator impacts the quality of estimation. Since most IS diagnostics do not depend on the choice of the function $h$, this correlation is ignored.

The contributions of this paper are twofold. First, we present a multivariate analogue of the original definition of ESS, avoiding most of the simplifying assumptions of Kong (1992). The proposed metric, which we call M-ESS, acknowledges that changing $h$, $q$, and the estimator employed should yield different qualities of estimation. Second, we adapt and integrate the multivariate sequential stopping rule techniques of Glynn and Whitt (1992) to determine when enough weighted samples have been obtained in IS. Our proposed method stops simulation when the volume of the confidence region for $\mu$ is small, relative to the variability of $h$ under $\pi$. We show that the confidence region constructed at the random time of termination is asymptotically valid, i.e., the probability that $\mu$ is contained in the confidence region at termination is asymptotically $1 - \alpha$. Moreover, adapting ideas from the MCMC literature (Vats et al., 2019), we show that this stopping rule is asymptotically equivalent to stopping the IS simulation when the estimated M-ESS is larger than an a priori obtained lower bound. The proposed methodology provides users a clear guideline on when to stop their simulations.

We implement our proposed ESS and stopping rules in two examples. First, we set both the target and the proposal to be multivariate Gaussian and analytically obtain the true variance of two different IS estimators. Using these variances, we analyze the quality of estimation of M-ESS and the performance of our termination rule as a function of the problem dimension and the degree of correlation in the resulting estimators. Our second example is a Bayesian Weibull multi-step step-stress model. Here, we employ our ESS for two different multivariate functions of interest to demonstrate the difference in the quality of estimation for different choices of $h$.

2 Importance Sampling and Effective Sample Size

We recall that interest is in estimating $\mu = \mathrm{E}_\pi[h(X)]$, where the target density $\pi$ may only be known up to a normalizing constant $Z$, so that

$$\pi(x) = \frac{\tilde{\pi}(x)}{Z}, \qquad Z = \int_{\mathsf{X}} \tilde{\pi}(x)\, dx. \qquad (4)$$

We assume that $\tilde{\pi}$ is known (can be evaluated at any $x$) while $Z$ may be known or unknown; without loss of generality, we assume $Z = 1$ when $\pi$ can be completely evaluated. For instance, in Bayesian inference, $Z$ is the intractable marginal likelihood. Based on the availability of $Z$, one of the two popular IS estimators may be employed. There certainly are situations where $\tilde{\pi}$ may also be unavailable (see Park and Haran, 2018); we exclude these from consideration in this work.

Let $q$ be an appropriately chosen proposal density and $X_1, \dots, X_n$ be iid samples from $q$. If $Z$ is known (so that $Z = 1$), $\mu$ may be estimated using the UIS estimator

$$\hat{\mu}_n^{\mathrm{UIS}} = \frac{1}{n} \sum_{i=1}^{n} w(X_i)\, h(X_i). \qquad (5)$$

When $Z$ is unknown, the UIS estimator cannot be calculated, in which case $\mu$ can be estimated by the SNIS estimator

$$\hat{\mu}_n^{\mathrm{SNIS}} = \frac{\sum_{i=1}^{n} w(X_i)\, h(X_i)}{\sum_{i=1}^{n} w(X_i)}. \qquad (6)$$
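To make the two estimators concrete, here is a minimal numerical sketch (not from the paper: the Gaussian target/proposal, the function $h$, and the helper names `uis_estimate` and `snis_estimate` are all illustrative choices), assuming the weights $w(X_i)$ of (3) with $Z = 1$:

```python
import numpy as np

def uis_estimate(h_vals, weights):
    """UIS estimator (5) with Z = 1: average of w(X_i) h(X_i)."""
    return np.mean(weights[:, None] * h_vals, axis=0)

def snis_estimate(h_vals, weights):
    """SNIS estimator (6): weighted average with normalized weights."""
    w_bar = weights / np.sum(weights)
    return w_bar @ h_vals

# Illustrative check: target N(1, 1), proposal N(0, 2^2), h(x) = (x, x^2),
# so the true mu is (1, 2).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=100_000)
target_pdf = np.exp(-0.5 * (x - 1.0) ** 2) / np.sqrt(2.0 * np.pi)
proposal_pdf = np.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * np.sqrt(2.0 * np.pi))
w = target_pdf / proposal_pdf
h_vals = np.column_stack([x, x ** 2])
mu_uis = uis_estimate(h_vals, w)    # close to (1, 2)
mu_snis = snis_estimate(h_vals, w)  # close to (1, 2)
```

Since the proposal is overdispersed relative to the target, the weights are bounded and both estimators are well behaved; with a lighter-tailed proposal the weights could have infinite variance.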

Importance sampling requires that $q(x) > 0$ for all $x$ such that $\pi(x) h(x) \neq 0$. Both $\hat{\mu}_n^{\mathrm{UIS}}$ and $\hat{\mu}_n^{\mathrm{SNIS}}$ converge to $\mu$ with probability 1 as $n \to \infty$; see Owen (2013); Robert and Casella (2013) for more details. Recall, we denote any IS estimator of $\mu$ as $\hat{\mu}_n^{\mathrm{IS}}$, and will use the specific notations when referring to either UIS or SNIS. Further, $\hat{\mu}_n^{\mathrm{IS}}$ often exhibits asymptotic normality, so that as $n \to \infty$,

$$\sqrt{n}\left(\hat{\mu}_n^{\mathrm{IS}} - \mu\right) \xrightarrow{d} N_p(0, \Sigma), \qquad (7)$$

where $\Sigma$ is the limiting covariance matrix. By a central limit theorem (CLT), asymptotic normality for $\hat{\mu}_n^{\mathrm{UIS}}$ is straightforward, and its limiting covariance is denoted by $\Sigma_{\mathrm{UIS}}$. For the SNIS estimator, when the relevant second moment is finite, asymptotic normality holds with

$$\Sigma_{\mathrm{SNIS}} = \mathrm{E}_q\!\left[ \frac{\pi(X)^2}{q(X)^2}\, \big(h(X) - \mu\big)\big(h(X) - \mu\big)^\top \right]. \qquad (8)$$

The form of $\Sigma_{\mathrm{SNIS}}$ has been presented before in Nilakanta (2020); Owen (2013), but for the sake of completeness, we provide the details in the Appendix. Under a finite second-moment condition, with the covariance of $h(X)$ under $\pi$ denoted by $\Lambda$, the vanilla Monte Carlo estimator $\hat{\mu}_n$ also satisfies a CLT with limiting covariance matrix

$$\Lambda = \mathrm{E}_\pi\!\left[\big(h(X) - \mu\big)\big(h(X) - \mu\big)^\top\right]. \qquad (9)$$

For a univariate $h$, in order to quantify the relative quality of $\hat{\mu}_n^{\mathrm{IS}}$ as compared to $\hat{\mu}_n$, Kong (1992) attempts to define the ESS as the ratio of the respective mean-squared errors. Since the bias of the SNIS estimator is difficult to ascertain, the ESS is first described as the ratio of the two variances (in the univariate case, both variances are scalars). Through a series of approximations, Kong (1992) obtains the following estimator of the above ESS:

$$\widehat{\mathrm{ESS}} = \frac{1}{\sum_{i=1}^{n} \bar{w}_i^{\,2}}, \qquad (10)$$

where $\bar{w}_i = w(X_i) / \sum_{j=1}^{n} w(X_j)$ are the normalized weights. Elvira et al. (2018) provide a full derivation of (10), including the assumptions and approximations, and discuss various shortcomings of $\widehat{\mathrm{ESS}}$. A critical undesirable quality of $\widehat{\mathrm{ESS}}$ is that it does not depend on the function of interest $h$ and thus claims the same estimation quality irrespective of $h$. Consequently, $\widehat{\mathrm{ESS}}$ is also unable to acknowledge multivariate estimation in $\mu$. Finally, since $\widehat{\mathrm{ESS}} \le n$, the estimator is unable to detect improvements in statistical efficiency, a key feature of IS.
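Since (10) depends only on the weights, its behavior is easy to exhibit in a short sketch (`kong_ess` is a hypothetical helper name, not from the paper):

```python
import numpy as np

def kong_ess(weights):
    """Kong's (1992) ESS estimate (10): 1 / sum of squared normalized weights."""
    w_bar = np.asarray(weights, dtype=float)
    w_bar = w_bar / w_bar.sum()
    return 1.0 / np.sum(w_bar ** 2)

# Equal weights recover ESS = n; a single dominant weight drives ESS toward 1,
# regardless of the function h or the estimator being used.
print(kong_ess(np.ones(500)))         # close to 500
print(kong_ess([100.0, 1e-6, 1e-6]))  # close to 1
```

Note that the output never exceeds the raw sample size, which is exactly the inability to detect efficiency gains discussed above.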

We propose a multivariate extension of the definition of ESS based on the original intention of Kong (1992), but replace the finite-time variances with the estimable limiting variances. That is, we define the multivariate ESS as

$$\mathrm{M\text{-}ESS} = n \left( \frac{|\Lambda|}{|\Sigma|} \right)^{1/p}. \qquad (11)$$

The $p$th root of the determinant of a covariance matrix is a dimension-free measure of variability (SenGupta, 1987) and is natural since the generalized variability of a random vector is quantified by the determinant of the corresponding covariance matrix (Wilks, 1932). Vats et al. (2019) employ a similar metric in the context of MCMC.

To avoid notational confusion, for the rest of the paper, we focus our attention on the SNIS estimator, presenting estimators of $\Lambda$ and $\Sigma_{\mathrm{SNIS}}$. Since

$$\Lambda = \mathrm{E}_\pi\!\left[h(X)h(X)^\top\right] - \mu\mu^\top, \qquad (12)$$

a plug-in estimator of $\Lambda$ is

$$\hat{\Lambda}_n = \sum_{i=1}^{n} \bar{w}_i\, h(X_i)h(X_i)^\top - \hat{\mu}_n^{\mathrm{SNIS}} \left(\hat{\mu}_n^{\mathrm{SNIS}}\right)^\top. \qquad (13)$$

Similarly, for $\Sigma_{\mathrm{SNIS}}$ from (8), a multivariate extension of the plug-in estimator by Owen (2013) is

$$\hat{\Sigma}_n = n \sum_{i=1}^{n} \bar{w}_i^{\,2} \left(h(X_i) - \hat{\mu}_n^{\mathrm{SNIS}}\right)\left(h(X_i) - \hat{\mu}_n^{\mathrm{SNIS}}\right)^\top. \qquad (14)$$
Remark 1.

For UIS, estimators of $\Lambda$ and $\Sigma_{\mathrm{UIS}}$, denoted by $\hat{\Lambda}_n^{\mathrm{UIS}}$ and $\hat{\Sigma}_n^{\mathrm{UIS}}$, respectively, are:

$$\hat{\Lambda}_n^{\mathrm{UIS}} = \frac{1}{n}\sum_{i=1}^{n} w(X_i)\, h(X_i)h(X_i)^\top - \hat{\mu}_n^{\mathrm{UIS}}\left(\hat{\mu}_n^{\mathrm{UIS}}\right)^\top, \qquad (15)$$
$$\hat{\Sigma}_n^{\mathrm{UIS}} = \frac{1}{n}\sum_{i=1}^{n} \left(w(X_i)h(X_i) - \hat{\mu}_n^{\mathrm{UIS}}\right)\left(w(X_i)h(X_i) - \hat{\mu}_n^{\mathrm{UIS}}\right)^\top. \qquad (16)$$

The estimated M-ESS, $n\,(|\hat{\Lambda}_n|/|\hat{\Sigma}_n|)^{1/p}$, indicates the quality of estimation of $\mu$ relative to a vanilla Monte Carlo estimator. Moreover, as we discuss in the next section, it also serves as the basis for a principled stopping criterion and can greatly alleviate ambiguity in terminating simulation.
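The estimated M-ESS can be assembled directly from the plug-in quantities of this section. The sketch below is a hypothetical implementation (the function name `snis_mess` and the mean-centered forms of the plug-ins, which are algebraically equivalent to moment forms like (13), are our choices):

```python
import numpy as np

def snis_mess(h_vals, weights):
    """Estimated M-ESS, n * (|Lambda_hat| / |Sigma_hat|)^(1/p), built from the
    SNIS plug-in estimators (13) and (14)."""
    h_vals = np.asarray(h_vals, dtype=float)
    n, p = h_vals.shape
    w_bar = weights / np.sum(weights)
    mu_hat = w_bar @ h_vals                        # SNIS estimate of mu
    c = h_vals - mu_hat
    lam_hat = (w_bar[:, None] * c).T @ c           # plug-in Lambda, as in (13)
    sig_hat = n * (w_bar[:, None] ** 2 * c).T @ c  # plug-in Sigma, as in (14)
    return n * (np.linalg.det(lam_hat) / np.linalg.det(sig_hat)) ** (1.0 / p)

# Sanity check: with constant weights, SNIS reduces to vanilla Monte Carlo and
# the estimated M-ESS equals the actual sample size n.
```

Unlike (10), this quantity changes with the function $h$ through `h_vals`, and it can exceed $n$ when the IS estimator is more efficient than vanilla Monte Carlo.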

3 Using ESS to Stop Simulation

A key practical question in any simulation paradigm is: when should sampling stop? Sequential stopping rules that check whether a desired criterion has been satisfied have been useful in answering this question in steady-state simulations (Dong and Glynn, 2019; Glynn and Whitt, 1992), stochastic programming (Bayraksan and Pierre-Louis, 2012), general Monte Carlo (Frey, 2010; Vats et al., 2021), and MCMC (Flegal and Gong, 2015; Vats et al., 2019). Such stopping rules terminate simulation when the size of the confidence region of the estimator is sufficiently small.

Ellipsoidal large-sample confidence regions for $\mu$ are available due to (7) and (14). Let $\chi^2_{1-\alpha, p}$ denote the $(1-\alpha)$-quantile of a chi-squared distribution with $p$ degrees of freedom. Then a large-sample confidence region around $\hat{\mu}_n^{\mathrm{SNIS}}$ is

$$C_\alpha(n) = \left\{ \mu \in \mathbb{R}^p : n \left(\hat{\mu}_n^{\mathrm{SNIS}} - \mu\right)^\top \hat{\Sigma}_n^{-1} \left(\hat{\mu}_n^{\mathrm{SNIS}} - \mu\right) \le \chi^2_{1-\alpha, p} \right\}. \qquad (17)$$

The volume of this confidence region is

$$\mathrm{Vol}(C_\alpha(n)) = \frac{2\, \pi^{p/2}}{p\, \Gamma(p/2)} \left( \frac{\chi^2_{1-\alpha, p}}{n} \right)^{p/2} |\hat{\Sigma}_n|^{1/2}. \qquad (18)$$
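The volume (18) is the standard ellipsoid-volume formula and is convenient to compute on the log scale; a sketch (the helper name `confidence_region_volume` is ours):

```python
import numpy as np
from scipy.stats import chi2
from scipy.special import gammaln

def confidence_region_volume(sigma_hat, n, alpha=0.05):
    """Volume (18) of the ellipsoid
    {mu : n (muhat - mu)' Sigmahat^{-1} (muhat - mu) <= chi2_{1-alpha,p}},
    computed on the log scale for numerical stability."""
    p = sigma_hat.shape[0]
    chi2_q = chi2.ppf(1.0 - alpha, df=p)
    log_vol = (np.log(2.0) + (p / 2.0) * np.log(np.pi)
               - np.log(p) - gammaln(p / 2.0)
               + (p / 2.0) * np.log(chi2_q / n)
               + 0.5 * np.log(np.linalg.det(sigma_hat)))
    return np.exp(log_vol)

# For p = 2 and Sigma_hat = I this reduces to the area of a disc of radius
# sqrt(chi2_{1-alpha,2} / n), i.e. pi * chi2_{1-alpha,2} / n.
```

The volume shrinks at rate $n^{-p/2}$, which is what drives the termination argument below.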

For a user-chosen relative tolerance $\epsilon > 0$, we derive a stopping rule that terminates the simulation when the estimated ESS is larger than a pre-determined lower bound. Following the relative standard deviation sequential stopping rule of Vats et al. (2019), consider stopping the simulation when the $p$th root of the volume of $C_\alpha(n)$ is an $\epsilon$th fraction of $|\hat{\Lambda}_n|^{1/2p}$. Specifically, for $s(n) \to 0$ as $n \to \infty$, the process terminates the first time that

$$\mathrm{Vol}(C_\alpha(n))^{1/p} + s(n) \le \epsilon\, |\hat{\Lambda}_n|^{1/2p}. \qquad (19)$$

Here, $s(n)$ is chosen to ensure that a user-chosen minimum simulation effort of $n^*$ is guaranteed, so that the simulation does not end too early due to unstable early estimates. We will comment on a particular choice later. Ignoring $s(n)$ for large $n$, (18) implies that the stopping rule in (19) is equivalent to stopping the simulation when:

$$n \left( \frac{|\hat{\Lambda}_n|}{|\hat{\Sigma}_n|} \right)^{1/p} \ge \frac{2^{2/p}\, \pi}{\left(p\, \Gamma(p/2)\right)^{2/p}} \cdot \frac{\chi^2_{1-\alpha, p}}{\epsilon^2}. \qquad (20)$$

This reformulation furnishes a rule that terminates the IS process when the estimated M-ESS is larger than a fixed lower bound. A lower level of relative tolerance $\epsilon$ yields a higher lower bound. Notice that the lower bound can be calculated even before the simulation begins. Implementations that yield large variability in the SNIS estimator will require a larger sample size to reach the desired lower bound. Such sequential stopping rules terminate simulation at a random time, and thus additional care must be taken to ensure that the asymptotic validity of the resulting confidence regions is retained. The following theorem builds on the works of Glynn and Whitt (1992); Vats et al. (2019). The theorem and proof (in the Appendix) are presented for the SNIS estimator; an analogous statement and proof are available for the UIS estimator.
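The lower bound on the right-hand side of (20) can be tabulated before any sampling; a sketch (the helper name `minimum_ess` is ours, and the log-scale arithmetic is a numerical convenience):

```python
import numpy as np
from scipy.stats import chi2
from scipy.special import gammaln

def minimum_ess(p, alpha=0.05, eps=0.05):
    """Lower bound on the estimated M-ESS implied by the stopping rule (20):
    2^(2/p) * pi / (p * Gamma(p/2))^(2/p) * chi2_{1-alpha,p} / eps^2.
    Computable before any samples are drawn."""
    log_bound = ((2.0 / p) * np.log(2.0) + np.log(np.pi)
                 - (2.0 / p) * (np.log(p) + gammaln(p / 2.0))
                 + np.log(chi2.ppf(1.0 - alpha, df=p))
                 - 2.0 * np.log(eps))
    return np.exp(log_bound)

# For p = 1 the bound reduces to 4 * chi2_{1-alpha,1} / eps^2, roughly 6,146
# effective samples for alpha = eps = 0.05.
```

In practice one would then monitor the estimated M-ESS as samples accumulate and stop the first time it exceeds `minimum_ess(p, alpha, eps)`.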

Theorem 1.

Let $\hat{\mu}_n^{\mathrm{SNIS}}$ and $\hat{\Sigma}_n$ be strongly consistent estimators of the attribute of the system, $\mu$, and of $\Sigma_{\mathrm{SNIS}}$, respectively. That is, let $\hat{\mu}_n^{\mathrm{SNIS}} \to \mu$ and $\hat{\Sigma}_n \to \Sigma_{\mathrm{SNIS}}$ with probability 1 as $n \to \infty$. Define $s(n) = \epsilon\, |\hat{\Lambda}_n|^{1/2p}\, \mathbb{I}(n < n^*) + n^{-1}$ for some finite $n^*$. Additionally, note that $s(n) \to 0$ as $n \to \infty$ with probability 1. For $\epsilon > 0$, consider the stopping rule

$$T(\epsilon) = \inf\left\{ n \ge 0 : \mathrm{Vol}(C_\alpha(n))^{1/p} + s(n) \le \epsilon\, |\hat{\Lambda}_n|^{1/2p} \right\}. \qquad (21)$$

As $\epsilon \to 0$, $T(\epsilon) \to \infty$ and $\Pr\left(\mu \in C_\alpha(T(\epsilon))\right) \to 1 - \alpha$.

Due to the random-time termination, Glynn and Whitt (1992) show that strong consistency of $\hat{\mu}_n^{\mathrm{SNIS}}$ and $\hat{\Sigma}_n$ is necessary for asymptotic validity of the resulting confidence regions. When $n < n^*$, $s(n) \ge \epsilon\, |\hat{\Lambda}_n|^{1/2p}$, making it so that simulation cannot stop. Thus, this choice of $s(n)$ ensures termination occurs after a minimum simulation effort of $n^*$. With this choice of $s(n)$ and $n^*$, $T(\epsilon)$ is asymptotically equal to the smallest $n$ that satisfies (19). Thus, stopping the simulation by checking whether the estimated M-ESS exceeds the lower bound in (20) will produce adequate confidence regions. The relative precision $\epsilon$ may be chosen depending on the desired quality of estimation. As $\epsilon \to 0$, $T(\epsilon) \to \infty$, implying the required sample size would also increase to infinity.

4 Examples

We implement our proposed multivariate ESS in two different examples. First, we present a controlled scenario where the ground truth is available, and therefore validation of the proposed methodology is possible. We estimate the mean vector of a multivariate normal distribution with a multivariate normal proposal using both UIS and SNIS. We arrive at expressions for the true limiting covariances and use these to assess the proposed termination criterion. Next, we present a Bayesian multi-step step-stress model where the interest is in estimating two different functions of interest. We demonstrate the utility of our M-ESS in acknowledging these different estimation goals.

4.1 Multivariate Normal

Let the target be $\pi = N_p(\mu, \Sigma_\pi)$, where $\mu \in \mathbb{R}^p$ and $\Sigma_\pi$ is a positive-definite matrix, and suppose $h$ is the identity function so that the goal is to estimate $\mu$. Set the proposal to be $q = N_p(\mu_q, \Sigma_q)$. For $p = 2$, we consider $\Sigma_\pi$ and $\Sigma_q$ of the form given below.

The limiting variance of the SNIS estimator is available in closed form (details are in the Appendix):

(22)

Additionally, for the UIS estimator,

(23)

Recall that most other diagnostics focus on the resulting weights, thereby not accounting for the differing quality of estimation from different estimators. We compare the true univariate and multivariate ESS for SNIS and UIS under three different settings:

  1. Setting 1: low target and proposal correlation. The covariance matrices $\Sigma_\pi$ and $\Sigma_q$ are

    (24)
  2. Setting 2: medium correlation in target and proposal. The covariance matrices $\Sigma_\pi$ and $\Sigma_q$ are

    (25)
  3. Setting 3: high and differing correlation in target and proposal. The covariance matrices $\Sigma_\pi$ and $\Sigma_q$ are

    (26)

Throughout, we set . For $p = 2$, the covariance ellipses of the target and proposal in the three settings, along with ellipses for $\Sigma_{\mathrm{UIS}}$ and $\Sigma_{\mathrm{SNIS}}$, are shown in Figure 1 (top). It is evident that in all three settings the variance of the UIS estimator is larger than that of the SNIS estimator. Figure 1 (bottom) shows the true univariate and multivariate ESS for UIS and SNIS versus the correlation between the two components in the proposal distribution; the correlation chosen for each setting is indicated through a vertical dashed line. First, the SNIS estimator is clearly preferred here over the UIS estimator; this is unsurprising given the ellipses in the top row. Second, the differing quality of univariate and multivariate ESS estimation is apparent. Not accounting for the complex dependence structures (as presented in the top row) leads to an inadequate understanding of the overall estimation quality.

Figure 1: (Top row) The target, proposal, UIS, and SNIS covariance ellipses for Settings 1, 2, 3 (left to right). (Bottom row) True ESS for SNIS and UIS from (22) and (23) for each setting. The vertical dotted line marks the proposal correlation value chosen in the three settings.

Next, we demonstrate the performance of the proposed stopping rule in (20). Controlling the desired quality of estimation through $\epsilon$, we study the impact of (i) the problem dimension $p$, (ii) the target/proposal setting, and (iii) the choice of estimator (UIS or SNIS). Since the dimension will now increase beyond $p = 2$, we present a general form of $\Sigma_\pi$ and $\Sigma_q$, similar to the three previous settings:

(27)

where

We set and . Similar to before, for the three settings, we let 1) and 2) , and 3) . These three settings represent low, medium, and high correlation between components for all dimensions.

We implement our multivariate stopping rule in (20) in 100 repeated simulations for varying choices of $\epsilon$ for both the UIS and SNIS estimators. Figure 2 presents the norm of the error in estimating $\mu$ versus the sample size at termination. That is, we calculate the error norm and plot it versus the sample size at termination for each $\epsilon$. The two rows present results for the two choices of the dimension $p$. Here, the circles represent the IS runs using the UIS estimator and the triangles represent the SNIS estimator.

Figure 2: error of SNIS and UIS estimator vs termination time for varying for (top and bottom row). The truth is marked using the vertical dashed lines for both estimators.

First, we note that since UIS produces less efficient estimators, for any given $\epsilon$, UIS terminates later than SNIS. Additionally, as $\epsilon$ decreases, the variability in the squared error also decreases; this is a direct consequence of the increase in the required sample sizes. Finally, we note that the quality of estimation of the ESS across all three settings is sound. Setting 3, which has the most correlation in the target, exhibits the most variability in the termination time.
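The Gaussian experiment can be imitated end to end in a few lines. The following is a hypothetical sketch (our own target/proposal parameters, batch size, and helper names, not the paper's code): draw proposal batches, compute SNIS weights, and stop the first time the estimated M-ESS clears the pre-computed bound from (20) with $\epsilon = 0.1$:

```python
import numpy as np
from scipy.stats import chi2
from scipy.special import gammaln

rng = np.random.default_rng(1)
p = 2
sig_pi = np.array([[1.0, 0.15], [0.15, 1.0]])   # illustrative target covariance
sig_q = 1.5 * np.eye(p)                          # overdispersed proposal

def log_mvn(x, cov):
    """Log density of N(0, cov) evaluated at the rows of x."""
    sol = np.linalg.solve(cov, x.T).T
    quad = np.einsum("ij,ij->i", x, sol)
    return -0.5 * (quad + np.log(np.linalg.det(cov)) + p * np.log(2.0 * np.pi))

def minimum_ess(p, alpha=0.05, eps=0.1):
    """Lower bound from (20)."""
    return np.exp((2.0 / p) * np.log(2.0) + np.log(np.pi)
                  - (2.0 / p) * (np.log(p) + gammaln(p / 2.0))
                  + np.log(chi2.ppf(1.0 - alpha, p)) - 2.0 * np.log(eps))

def snis_mess(h_vals, w):
    """Estimated M-ESS via the SNIS plug-ins (13) and (14)."""
    n = h_vals.shape[0]
    w_bar = w / w.sum()
    c = h_vals - w_bar @ h_vals
    lam = (w_bar[:, None] * c).T @ c
    sig = n * (w_bar[:, None] ** 2 * c).T @ c
    return n * (np.linalg.det(lam) / np.linalg.det(sig)) ** (1.0 / p)

bound = minimum_ess(p)           # ~1882 for p = 2, alpha = 0.05, eps = 0.1
n, batch = 0, 1000
xs = np.empty((0, p))
while True:                      # sequential stopping via (20)
    xs = np.vstack([xs, rng.multivariate_normal(np.zeros(p), sig_q, size=batch)])
    n += batch
    w = np.exp(log_mvn(xs, sig_pi) - log_mvn(xs, sig_q))
    if snis_mess(xs, w) >= bound:
        break
```

Because the lower bound depends only on $p$, $\alpha$, and $\epsilon$, it is fixed in advance; the random quantity is the termination time, exactly as in Theorem 1.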

4.2 Weibull Multi-Step Step-Stress Model

Consider the fish dataset of Pal et al. (2021), described in Table 1, where the swimming performance of fish was investigated with an initial swimming rate of 15 cm/sec. The time at which a fish could not maintain its natural position was recorded as the failure time. The flow rate was increased by 5 cm/sec after 110, 130, and 150 minutes. Here, the increased flow rate can be thought of as a stress factor. Thus, there are four stress levels, and the observed numbers of failures at the four stress levels are 6, 3, 3, and 2, respectively. The failure times are centered by 80 and scaled by 100, as recommended by Pal et al. (2021), so that the time-points corresponding to stress changes are 0.3, 0.5, and 0.7.

Stress level    Failure times
1               83.50, 91.00, 91.00, 97.00, 107.00, 109.50
2               114.00, 115.41, 128.61
3               133.53, 138.58, 140.00
4               152.08, 155.10
Table 1: Fish Dataset

Let be the total number of failures observed during and before the th stress-level and let denote the time-to-failure of the th fish. Let the collected data of ordered observed failure times be denoted by . Pal et al. (2021) assume that the lifetime distribution of the experimental units under a given stress level , follows Weibull for

with probability density function

(28)

To allow for ordering in the time-to-failure across subsequent stress levels, the scale parameters are assumed to be ordered. The following are the independent priors assumed by Pal et al. (2021):

(29)

where ODG stands for the ordered Dirichlet Gamma distribution. Given the Bayesian paradigm, the resulting posterior distribution is the primary object of interest and can be written down as

(30)

where and are densities of an ODG and Gamma distribution respectively; the exact parameters and details on are given in the Appendix. We set and .

Interest may be in the posterior mean or the average lifetime units for a stress level. For a Weibull distribution, the mean is , so two functions of interest are

(31)

Similar to Pal et al. (2021), we employ IS with proposal

(32)
Figure 3: Estimated pairwise correlations for (left) and (right).

Using this proposal, the weight function is . Since the normalizing constants in the posterior distribution are unknown, only the SNIS estimator can be implemented. First, in order to visualize the complex correlation structures of and , Figure 3 plots the estimated sample correlation matrices of the corresponding . Since significant correlation between components for both and is evident, multivariate assessment of the quality of estimation is warranted.

The disparity between ESS estimation using the multivariate ESS and the univariate ESS for five different components of $h$ can be seen in the left-side plot of Figure 4. Here we present the estimated M-ESS, the univariate ESS for each component, and the estimator of Kong (1992). Since interest is in estimating all components of $\mu$, each of the individual univariate ESSs provides only partial information about the quality of estimation, whereas M-ESS provides a complete understanding of the relative quality of estimation.

Figure 4: Left: ESS (Kong, univariate ESS, and M-ESS) vs sample size for where . Right: ESS (Kong and M-ESS) vs sample size for and . For each sample size, simulations are run for replications and the error bars for estimated ESS are drawn using two standard deviations.

Further, in Figure 4 (right), the estimator of Kong (1992) is limited to quantifying the quality of the proposal distribution in relation to the target, and does not account for the different functions of interest. It is also evident that for any given sample size, the M-ESS for estimating the one function is significantly smaller than the M-ESS for the other. This indicates a requirement of increased simulation effort and enables users to make informed decisions motivated by their primary objective. For instance, running the simulations with , we find that the necessary sample size (averaged over replications) using (20) is for estimating and for estimating .

5 Discussion

We present a practical stopping criterion for IS that is based on self-assessing the quality of estimation in a multivariate setting. Note that choosing and adapting a proposal distribution is a rich and important area of work, known as adaptive importance sampling (AIS), and many existing algorithms are devoted to this task, e.g., PMC (Cappé et al., 2004), AMIS (Cornuet et al., 2012), M-PMC (Cappé et al., 2008), LAIS (Martino et al., 2015), DM-PMC (Elvira et al., 2017), or O-PMC (Elvira and Chouzenoux, 2021) (see Bugallo et al., 2017, for a detailed review). However, the existing methods do not consider an adaptive choice of the number of proposals nor the number of samples. We are confident that the proposed methodology can be useful to develop more efficient AIS algorithms.

We have also proposed an alternative diagnostic metric with a structure similar to the traditional ESS. The classical ESS is traditionally defined as the relative variance of the vanilla Monte Carlo estimator and the chosen estimator (see Kong (1992) and the discussion in Elvira et al. (2018)). One can argue against the choice of the vanilla Monte Carlo estimator as the baseline; this may be seen in applications to rare-event simulation (Owen et al., 2019), where the ESS would not be as informative. Our proposed stopping rule in (20) and Theorem 1 can be adapted to such changes in the baseline. While the standard metric of Kong (1992) does not translate into the number of effective samples from the target, it can still be useful in understanding the quality of the weights. We highlight that it is useful in scenarios where a specific function of interest may not be identified. Thus, it can be used as a qualifier to ascertain the quality of the proposal distribution for a particular target through the variance of the importance weights.

Other IS estimators have been proposed that promise improvements in estimation quality (see Martino et al. (2018); Elvira et al. (2019); Vehtari et al. (2015)) by utilizing weight-modification techniques. These methods exhibit variance reduction in theory, but there are no known methods of estimating their variance other than bootstrap techniques. Thus, variance estimation for other IS estimators is an interesting and critical line of future research.

Appendix A Appendix

a.1 Derivation of variance of SNIS estimator

Let us define the mapping $v : \mathsf{X} \to \mathbb{R}^{p+1}$ such that $v(x) = \left(w(x)h(x)^\top,\; w(x)\right)^\top$.

This allows us to define independent random variables $Y_i = v(X_i)$, for $i = 1, \dots, n$, so that $\mathrm{E}_q[Y_i] = (Z\mu^\top, Z)^\top$. Let $\Omega$ denote the covariance matrix of $Y_1$, and we assume that the diagonal elements of $\Omega$ are finite; this may not always be the case, and care must be taken to ensure that the proposal distribution is such that the diagonal elements of $\Omega$ are finite. Let $\bar{Y}_n$ denote the sample average of $Y_1, \dots, Y_n$. By a standard central limit theorem,

$$\sqrt{n}\left(\bar{Y}_n - \mathrm{E}_q[Y_1]\right) \xrightarrow{d} N_{p+1}(0, \Omega).$$

Define the function $g$ as $g(a, b) = a/b$ for $a \in \mathbb{R}^p$ and $b \neq 0$. Then, using the multivariate delta method (Lehmann, 2004) for the variance of a ratio of means, we can perform a second-degree approximation for $g(\bar{Y}_n)$ as shown by Rice (2006), with the gradient $\nabla g$ evaluated at $\mathrm{E}_q[Y_1]$.

Note that $g$ translates the sample average to the SNIS estimator as $g(\bar{Y}_n) = \hat{\mu}_n^{\mathrm{SNIS}}$.

Therefore, as $n \to \infty$, we have

$$\sqrt{n}\left(\hat{\mu}_n^{\mathrm{SNIS}} - \mu\right) \xrightarrow{d} N_p\left(0,\; \nabla g^\top\, \Omega\, \nabla g\right). \qquad (33)$$

Thus, $\Sigma_{\mathrm{SNIS}} = \nabla g^\top\, \Omega\, \nabla g$, which simplifies to the form in (8).

a.2 Proof of Theorem 1

Proof.

For the purpose of this proof, we will extend the notation of to explicitly display the number of samples used in estimation. Let denote the estimators constructed using importance samples. First we show that as , . Consider . As , and , yielding .

Define and

Recall from (17) and the statement of Theorem 1

Using the fact that and is consistent, we have as . The following limit follows directly from Vats et al. (2019)

Using the above, a functional delta method, and a standard random time change argument (Billingsley, 2013, p. 144) we have,

As a consequence, as $\epsilon \to 0$, $\Pr\left(\mu \in C_\alpha(T(\epsilon))\right) \to 1 - \alpha$. ∎

a.3 Ground Truth Calculations for Multivariate Normal example

Recall from Eq.(8), the limiting variance of is given by

In Section 4.1, . Plugging this in Eq.(8), we get

Next,

(34)
(35)

a.4 Details of Bayesian Multi-Step Step-Stress Model

The combined likelihood for the fish dataset in Section 4.2 with is,

(36)

where

(37)

Let and , then , , and from the full posterior distribution in Eq.(30) are

References

  • Bayraksan and Pierre-Louis (2012) Bayraksan, G. and Pierre-Louis, P. (2012). Fixed-width sequential stopping rules for a class of stochastic programs. SIAM Journal on Optimization, 22:1518–1548.
  • Billingsley (2013) Billingsley, P. (2013). Convergence of Probability Measures. John Wiley & Sons.
  • Bugallo et al. (2017) Bugallo, M. F., Elvira, V., Martino, L., Luengo, D., Miguez, J., and Djuric, P. M. (2017). Adaptive importance sampling: the past, the present, and the future. IEEE Signal Processing Magazine, 34(4):60–79.
  • Cappé et al. (2008) Cappé, O., Douc, R., Guillin, A., Marin, J. M., and Robert, C. P. (2008). Adaptive importance sampling in general mixture classes. Statistical Computing, 18:447–459.
  • Cappé et al. (2004) Cappé, O., Guillin, A., Marin, J. M., and Robert, C. P. (2004). Population Monte Carlo. Journal of Computational and Graphical Statistics, 13(4):907–929.
  • Chatterjee et al. (2018) Chatterjee, S., Diaconis, P., et al. (2018). The sample size required in importance sampling. The Annals of Applied Probability, 28(2):1099–1135.
  • Cornuet et al. (2012) Cornuet, J. M., Marin, J. M., Mira, A., and Robert, C. P. (2012). Adaptive multiple importance sampling. Scandinavian Journal of Statistics, 39(4):798–812.
  • Denny (2001) Denny, M. (2001). Introduction to importance sampling in rare-event simulations. European Journal of Physics, 22(4):403.
  • Dong and Glynn (2019) Dong, J. and Glynn, P. (2019). A new approach to sequential stopping for stochastic simulation. Preprint.
  • Ekvall and Jones (2014) Ekvall, K. O. and Jones, G. L. (2014). Markov chain Monte Carlo. Wiley StatsRef: Statistics Reference Online, pages 1–9.
  • Elvira and Chouzenoux (2021) Elvira, V. and Chouzenoux, E. (2021). Optimized Population Monte Carlo. Working paper or preprint.
  • Elvira and Martino (2021) Elvira, V. and Martino, L. (2021). Advances in importance sampling. Wiley StatsRef: Statistics Reference Online, arXiv:2102.05407.
  • Elvira et al. (2017) Elvira, V., Martino, L., Luengo, D., and Bugallo, M. F. (2017). Improving population Monte Carlo: Alternative weighting and resampling schemes. Signal Processing, 131:77–91.
  • Elvira et al. (2019) Elvira, V., Martino, L., Luengo, D., and Bugallo, M. F. (2019). Generalized multiple importance sampling. Statistical Science, 34(1):129–155.
  • Elvira et al. (2018) Elvira, V., Martino, L., and Robert, C. P. (2018). Rethinking the effective sample size. arXiv preprint arXiv:1809.04129.
  • Flegal and Gong (2015) Flegal, J. M. and Gong, L. (2015). Relative fixed-width stopping rules for Markov chain Monte Carlo simulations. Statistica Sinica, pages 655–675.
  • Frey (2010) Frey, J. (2010). Fixed-width sequential confidence intervals for a proportion. The American Statistician, 64(3):242–249.
  • Glynn and Whitt (1992) Glynn, P. W. and Whitt, W. (1992). The asymptotic validity of sequential stopping rules for stochastic simulations. Annals of Applied Probability, 2:180–198.
  • Kahn (1950a) Kahn, H. (1950a). Random sampling (Monte Carlo) techniques in neutron attenuation problems. i. Nucleonics (US) Ceased Publication, 6(See also NSA 3-990).
  • Kahn (1950b) Kahn, H. (1950b). Random sampling (Monte Carlo) techniques in neutron attenuation problems. ii. Nucleonics (US) Ceased Publication, 6(See also NSA 4-3795).
  • Kong (1992) Kong, A. (1992). A note on importance sampling using standardized weights. University of Chicago, Dept. of Statistics, Tech. Rep, 348.
  • Kuntz et al. (2021) Kuntz, J., Crucinio, F. R., and Johansen, A. M. (2021). Product-form estimators: exploiting independence to scale up Monte Carlo. arXiv preprint arXiv:2102.11575.
  • Lehmann (2004) Lehmann, E. L. (2004). Elements of Large-sample Theory. Springer Science & Business Media.
  • Martino et al. (2015) Martino, L., Elvira, V., Luengo, D., and Corander, J. (2015). Layered adaptive importance sampling. Statistics and Computing, 27(3):599–623.
  • Martino et al. (2018) Martino, L., Elvira, V., Míguez, J., Artés-Rodríguez, A., and Djurić, P. (2018). A comparison of clipping strategies for importance sampling. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pages 558–562. IEEE.
  • Nilakanta (2020) Nilakanta, H. (2020). Output analysis of Monte Carlo methods with applications to networks and functional approximation. PhD thesis, University of Minnesota.
  • Owen (2013) Owen, A. B. (2013). Monte Carlo Theory, Methods and Examples.
  • Owen et al. (2019) Owen, A. B., Maximov, Y., and Chertkov, M. (2019). Importance sampling the union of rare events with an application to power systems analysis. Electronic Journal of Statistics, 13(1):231–254.
  • Pal et al. (2021) Pal, A., Mitra, S., and Kundu, D. (2021). Bayesian order-restricted inference of a Weibull multi-step step-stress model. Journal of Statistical Theory and Practice, 15(2):1–33.
  • Park and Haran (2018) Park, J. and Haran, M. (2018). Bayesian inference in the presence of intractable normalizing functions. Journal of the American Statistical Association, 113:1372–1390.
  • Rice (2006) Rice, J. A. (2006). Mathematical Statistics and Data Analysis. Nelson Education.
  • Robert and Casella (2013) Robert, C. and Casella, G. (2013). Monte Carlo Statistical Methods. Springer Science & Business Media.
  • Sanz-Alonso (2018) Sanz-Alonso, D. (2018). Importance sampling and necessary sample size: an information theory approach. SIAM/ASA Journal on Uncertainty Quantification, 6(2):867–879.
  • Sanz-Alonso and Wang (2021) Sanz-Alonso, D. and Wang, Z. (2021). Bayesian update with importance sampling: Required sample size. Entropy, 23(1):22.
  • SenGupta (1987) SenGupta, A. (1987). Tests for standardized generalized variances of multivariate normal populations of possibly different dimensions. Journal of Multivariate Analysis, 23:209–219.
  • Vats et al. (2019) Vats, D., Flegal, J. M., and Jones, G. L. (2019). Multivariate output analysis for Markov chain Monte Carlo. Biometrika, 106(2):321–337.
  • Vats et al. (2021) Vats, D., Flegal, J. M., and Jones, G. L. (2021). Monte Carlo Simulation: Are We There Yet?, pages 1–15. In Wiley StatsRef: Statistics Reference Online.
  • Vehtari et al. (2015) Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2015). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646.
  • Wilks (1932) Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, pages 471–494.