In computational statistics and numerical analysis one of the major challenges is the development and investigation of estimators of expectations. A prototypical setting is the approximation of the integral
\[
S(f) = \int_{[0,1]^d} f(x)\,\mathrm{d}x \tag{1}
\]
w.r.t. the Lebesgue measure for an integrable function $f\colon [0,1]^d \to \mathbb{R}$, i.e., a function with $\int_{[0,1]^d} |f(x)|\,\mathrm{d}x < \infty$. Given a (sufficiently rich) probability space $(\Omega, \mathcal{F}, \mathbb{P})$ (all random variables are assumed to be defined on this probability space), a measurable space $(G, \mathcal{G})$, a function $f\colon G \to \mathbb{R}$ and a $G$-valued random variable $X$ we can ask more generally for the computation of
\[
\mathbb{E}[f(X)]. \tag{2}
\]
For this goal (structured) Monte Carlo methods have been constructed. For $n \in \mathbb{N}$ a Monte Carlo method is given by a real-valued random variable $A_n(f)$ which typically depends on a $G$-valued sequence of random variables $(X_i)_{i \in \mathbb{N}}$ and $f$. Here $n$ indicates the number of function evaluations of $f$ that are allowed to be used. Classical examples of such methods (exploiting different properties of $f$) for the approximation of $\mathbb{E}[f(X)]$ are for instance antithetic sampling, importance sampling or control variate based estimators, see e.g. [MGNR12, Owe13].
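For orientation, the plain equal-weight Monte Carlo estimator $A_n(f) = \frac{1}{n}\sum_{i=1}^{n} f(X_i)$ can be sketched in a few lines of Python; the function names and the test integrand $f(x) = x^2$ are illustrative choices, not part of the discussion above.

```python
import random

def mc_estimate(f, sampler, n):
    """Equal-weight Monte Carlo estimate: the average of n evaluations
    of f at points drawn by calling sampler()."""
    return sum(f(sampler()) for _ in range(n)) / n

# Approximate the integral of f(x) = x^2 over [0, 1]; the true value is 1/3.
random.seed(0)
est = mc_estimate(lambda x: x * x, random.random, 100_000)
```

With $n = 10^5$ iid uniform samples the estimate typically lies within a few multiples of the standard error $\sqrt{\operatorname{Var}(f(X))/n} \approx 10^{-3}$ of $1/3$.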
We focus on the prototypical setting of the approximation of $S(f)$ from (1), where $G = [0,1]^d$ and $X$ is uniformly distributed in $[0,1]^d$, and consider methods based on randomized $(t,s)$-sequences [Owe95], Latin hypercube sampling [MBC79], randomized Frolov point sets [KN17] and Cranley-Patterson rotations [CP76].
A minimal requirement for a reasonable proxy of $S(f)$ is that for sufficiently large and increasing $n$ it gets ‘close’ to $S(f)$. We ask of the sequence of random variables $(A_n(f))_{n \in \mathbb{N}}$ that it at least converges in probability to $S(f)$, i.e., for all $\varepsilon > 0$ we have
\[
\lim_{n \to \infty} \mathbb{P}\bigl( |A_n(f) - S(f)| > \varepsilon \bigr) = 0. \tag{3}
\]
A stronger desirable property is almost sure convergence, i.e.,
\[
\mathbb{P}\Bigl( \lim_{n \to \infty} A_n(f) = S(f) \Bigr) = 1. \tag{4}
\]
For the classical Monte Carlo estimator $A_n(f) = \frac{1}{n} \sum_{i=1}^{n} f(X_i)$, based on an iid sequence $(X_i)_{i \in \mathbb{N}}$ of uniformly in $[0,1]^d$ distributed random variables, (3) is known as weak law of large numbers (WLLN) and (4) as strong law of large numbers (SLLN). Motivated by this fact, we also say a (general) Monte Carlo method satisfies such a law of large numbers if the corresponding limit property ((3) or (4)) holds.
Depending on integrability properties of the integrand we provide conditions for general methods to satisfy a SLLN or WLLN. For $p \in [1, \infty)$ we define $L^p = L^p([0,1]^d)$ to be the set of all measurable functions $f \colon [0,1]^d \to \mathbb{R}$ with finite $\|f\|_p = \bigl( \int_{[0,1]^d} |f(x)|^p \,\mathrm{d}x \bigr)^{1/p}$. We show that for all aforementioned estimators of (1) a WLLN holds whenever $f \in L^1$. Furthermore, we argue for $f \in L^p$ with $p > 1$ how to get a SLLN for median modified methods. In the latter scenario we follow the approach of [OR21].
Now let us discuss how these results fit into the literature and whether they can be improved. Define the expected absolute and probabilistic error of an arbitrary Monte Carlo method $A_n$ for $f \in L^1$ with $n \in \mathbb{N}$ as $\mathbb{E}\,|A_n(f) - S(f)|$ and $\mathbb{P}\bigl( |A_n(f) - S(f)| > \varepsilon \bigr)$, respectively. Then, by virtue of [KR19, Theorem 2.3] there are constants $\varepsilon_0 > 0$, $\delta_0 > 0$ and $c_0 > 0$ with the following properties: For any $n \in \mathbb{N}$ and any arbitrary Monte Carlo method $A_n$ we can find a function $f \in L^1$ satisfying $\|f\|_1 \le 1$, such that for $\varepsilon \in (0, \varepsilon_0]$ holds
\[
\mathbb{P}\bigl( |A_n(f) - S(f)| > \varepsilon \bigr) \ge \delta_0 \tag{5}
\]
and
\[
\mathbb{E}\,|A_n(f) - S(f)| \ge c_0. \tag{6}
\]
By (5), a WLLN cannot be improved to a non-trivial bound of the probabilistic error uniformly for all $f$ with $\|f\|_1 \le 1$. Due to (6) we also have that no method has a decreasing expected absolute error uniformly for this class of integrands. But it is well known that the classical Monte Carlo estimator satisfies a SLLN for $f \in L^1$ and that a SLLN implies a WLLN. Therefore, for a given method for the approximation of $S(f)$ with $f \in L^1$ we can ask for a SLLN, even when the uniform probabilistic and expected absolute error does not decrease.
A SLLN for estimators based on randomized $(t,s)$-sequences and integrands $f \in L^p$ with $p > 1$ has been established in [OR21]. The arguments there are based on the interpolation theorem of Riesz-Thorin as well as a subsequence technique already appearing in [Ete81]. We consider a median of general estimators and show a SLLN without using the subsequence technique. However, we also require $f \in L^p$ with $p > 1$ for the integrand. In contrast to that our generic results about the WLLN, see Theorem 2.1 and Corollary 2.2, require only $f \in L^1$. We apply these to different settings including randomized digital nets and Latin hypercube sampling, therefore extending the WLLNs of [Loh96, OR21].
Now we briefly outline the structure. Section 2 is devoted to WLLNs. In particular Theorem 2.1 and Corollary 2.2 establish WLLNs for $f \in L^1$ under further mild conditions. We discuss estimators based on samples from randomized $(t,s)$-sequences, Latin hypercube sampling and the randomized Frolov algorithm, for which this result is applicable. Moreover, we consider Cranley-Patterson rotations for which it is also possible to derive a WLLN. In Section 3 we describe how the median of independent realizations improves the probabilistic error and can be used to derive a SLLN for $f \in L^p$ with $p > 1$.
2 Weak laws of large numbers
The following two results are our main tools for establishing WLLNs for Monte Carlo methods. Both are applicable to a broad class of algorithms and the underlying requirements are easy to verify.
Theorem 2.1.
Let $A_n$ be some (randomized) estimator for $S(f)$, where $f \in L^1$, and assume that $A_n$ is linear ($A_n$ is linear if $A_n(\alpha f + g) = \alpha A_n(f) + A_n(g)$ for $\alpha \in \mathbb{R}$ and $f, g \in L^1$) and unbiased ($A_n$ is unbiased if $\mathbb{E}[A_n(f)] = S(f)$ for any $f \in L^1$). If the WLLN holds for some dense subset $F \subseteq L^1$ (w.r.t. $\|\cdot\|_1$), then the WLLN also holds for $L^1$.
Proof. Let $f \in L^1$ and $\varepsilon \in (0,1)$ be arbitrary. Choose $g \in F$ such that $\|f - g\|_1 \le \varepsilon^2$. We observe that
\[
\mathbb{P}\bigl( |A_n(f) - S(f)| > 3\varepsilon \bigr)
\le \mathbb{P}\bigl( |A_n(f - g)| > \varepsilon \bigr)
+ \mathbb{P}\bigl( |A_n(g) - S(g)| > \varepsilon \bigr)
+ \mathbb{P}\bigl( |S(g) - S(f)| > \varepsilon \bigr).
\]
Applying Markov’s inequality, linearity and unbiasedness of $A_n$ we deduce $\mathbb{P}\bigl( |A_n(f - g)| > \varepsilon \bigr) \le \varepsilon^{-1}\, \mathbb{E}\,|A_n(f - g)| \le \varepsilon^{-1} \|f - g\|_1 \le \varepsilon$. Similarly, $\mathbb{P}\bigl( |S(g) - S(f)| > \varepsilon \bigr) \le \varepsilon^{-1} \|f - g\|_1 \le \varepsilon$. By assumption we have a WLLN for $g \in F$, hence there is some $N \in \mathbb{N}$ such that for any $n \ge N$ holds $\mathbb{P}\bigl( |A_n(g) - S(g)| > \varepsilon \bigr) \le \varepsilon$. In conclusion
\[
\mathbb{P}\bigl( |A_n(f) - S(f)| > 3\varepsilon \bigr) \le 3\varepsilon
\]
for any $n \ge N$. Since $\varepsilon$ was arbitrary the result follows. ∎
Corollary 2.2.
Let $A_n$ be some (randomized) estimator for $S(f)$, where $f \in L^1$, and assume that $A_n$ is linear and unbiased. If the WLLN holds for $C^k$ functions, for some $k \in \mathbb{N}_0 \cup \{\infty\}$, then the WLLN holds also for $L^1$.
Proof. Denote by $C_c^\infty = C_c^\infty((0,1)^d)$ the set of all functions $f \colon (0,1)^d \to \mathbb{R}$ that are infinitely often differentiable and whose support is a compact subset of $(0,1)^d$. We have that $C_c^\infty \subseteq C^k$ for any $k \in \mathbb{N}_0 \cup \{\infty\}$, and it is a well known fact that $C_c^\infty$ is dense in $L^1([0,1]^d)$ w.r.t. $\|\cdot\|_1$. (In fact one has that $C_c^\infty((0,1)^d)$ is dense in $L^1((0,1)^d)$. However, the open and closed unit cube only differ by a set of measure zero, so the density result easily extends to $L^1([0,1]^d)$.)
By assumption we have a WLLN for $C^k$ functions, and consequently for $C_c^\infty$ functions. Thus the result follows by an application of Theorem 2.1. ∎
We now turn to examples. First, we consider an estimator of the form
\[
A_n(f) = \frac{1}{n} \sum_{i=1}^{n} f(x_i), \tag{7}
\]
where $\{x_1, \dots, x_n\}$ is a random point set, i.e., $x_i$ denotes a $[0,1]^d$-valued random variable for any $i \in \{1, \dots, n\}$. Obviously, $A_n$ is linear and whenever $\{x_1, \dots, x_n\}$ consists of uniformly distributed random variables in $[0,1]^d$ it is also unbiased.
Example 2.4 (Randomized $(t,s)$-sequences).
We consider randomized $(t,s)$-sequences as in [Owe95]. The underlying deterministic points are $(t,s)$-sequences, which also prove to be useful for numerical integration (cf. [DP10]), and randomization is done, roughly speaking, by randomly permuting digits. One obtains a sequence of random points, denoted by $(x_i)_{i \in \mathbb{N}}$, which consists of uniformly distributed points and satisfies crucial properties of a $(t,s)$-sequence with probability one, see [Owe95]. Taking the first $n$ of these points and defining $A_n$ as specified in (7) we indeed have a linear and unbiased estimator.
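A full implementation of the digit permutations of [Owe95] is beyond the scope of this note; the following toy Python sketch instead applies a random digital (XOR) shift to the base-2 van der Corput sequence. This is a strictly simpler randomization than digit scrambling, and the function name and parameters are illustrative, but it already yields uniformly distributed points that keep the stratification of the deterministic sequence.

```python
import random

def shifted_van_der_corput(n, bits=32, rng=random):
    """First n points of the base-2 van der Corput sequence,
    randomized by a single random XOR (digital) shift."""
    shift = rng.getrandbits(bits)
    points = []
    for i in range(n):
        # Radical inverse: reverse the binary digits of i.
        rev = int(format(i, f"0{bits}b")[::-1], 2)
        points.append((rev ^ shift) / 2 ** bits)
    return points
```

For $n = 2^k$ the randomized points still occupy all $n$ dyadic strata of width $1/n$, one point per stratum, since the XOR shift permutes the leading digit patterns.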
Example 2.5 (Latin hypercube sampling).
We study Latin hypercube sampling [MBC79]. The corresponding algorithm produces uniformly distributed points satisfying stratification properties without requiring an infeasible amount of computational resources. We refer to the book [Owe13] for a more detailed introduction.
Using those points for $x_1, \dots, x_n$ within $A_n$ of (7) yields a linear and unbiased estimator. In [Loh96] a SLLN for continuous functions is established, which implies a WLLN for such functions. Making use of Corollary 2.2 the WLLN generalizes to $L^1$. We remark that consistency of Latin hypercube sampling was already observed in [AHT12, PS08]; however, a proof is not given there and seems tricky to find in the literature.
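A minimal Python sketch of Latin hypercube sampling (function and parameter names are illustrative): each coordinate axis is split into $n$ strata of width $1/n$, an independent random permutation assigns one stratum per point and coordinate, and each point is jittered uniformly within its stratum.

```python
import random

def latin_hypercube(n, d, rng=random):
    """n points in [0,1)^d such that along every coordinate each of the
    n strata [k/n, (k+1)/n) contains exactly one point."""
    points = [[0.0] * d for _ in range(n)]
    for j in range(d):
        perm = list(range(n))
        rng.shuffle(perm)  # independent permutation per coordinate
        for i in range(n):
            # point i lands in stratum perm[i], jittered inside it
            points[i][j] = (perm[i] + rng.random()) / n
    return points
```

Every point is marginally uniform on $[0,1)^d$, so plugging these points into the equal-weight estimator (7) gives a linear and unbiased method.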
Example 2.6 (Cranley-Patterson rotations).
A Cranley-Patterson rotation, cf. [CP76], randomizes deterministic points $y_1, \dots, y_n \in [0,1]^d$ to $x_1, \dots, x_n$ by setting (coordinate-wise) $x_i = y_i + U \ (\mathrm{mod}\ 1)$ for $i = 1, \dots, n$, where $U$ is uniformly distributed in $[0,1]^d$. The $x_i$’s are again uniformly distributed in $[0,1]^d$, hence $A_n$ as given in (7) is linear and unbiased. We refer to [Owe13, Chapter 17.3] for a more detailed discussion about this method.
Assume now that the star-discrepancy of $y_1, \dots, y_n$ converges to zero as $n \to \infty$; then we immediately obtain a WLLN for functions in $C^\infty([0,1]^d)$ and by virtue of Theorem 2.1 a WLLN for $L^1$ follows.
Under rather mild conditions regarding the construction of a Hammersley point set, a lattice point set or a $(t,s)$-sequence, the star-discrepancy of any of these point sets decreases to zero. However, it might be necessary to switch to a subsequence, cf. [DP10].
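The rotation itself is a one-liner per coordinate; the Python sketch below (names illustrative) shifts a given deterministic point set by a single uniform vector modulo 1.

```python
import random

def cranley_patterson(points, rng=random):
    """Shift every point by one common uniform vector U, coordinate-wise
    modulo 1; each resulting point is uniformly distributed on [0,1)^d."""
    d = len(points[0])
    shift = [rng.random() for _ in range(d)]
    return [[(x + s) % 1.0 for x, s in zip(p, shift)] for p in points]

# Example: rotate a tiny deterministic point set in [0,1)^2.
rotated = cranley_patterson([[0.0, 0.5], [0.25, 0.75]], rng=random.Random(2))
```

Since all points receive the same shift, coordinate-wise gaps between points are preserved modulo 1, which is why low-discrepancy structure survives the randomization.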
We add an example where $A_n$ does not take the form of (7).
Example 2.7 (Randomized Frolov points).
Randomized Frolov points, as studied in [KN17] and later in [Ull17], rely on a random shift and dilatation of (deterministic) Frolov points (cf. [Ull16] for an introductory paper), which themselves provide a powerful tool for numerical integration in Sobolev spaces, see [UU16] for a survey.
For technical reasons we extend any $f \in L^1([0,1]^d)$ to a function defined on $\mathbb{R}^d$ by setting it zero outside the unit cube. Hence for the integral of interest holds $S(f) = \int_{\mathbb{R}^d} f(x)\,\mathrm{d}x$.
Here for any $n \in \mathbb{N}$ the estimator is given by
\[
A_n(f) = \frac{1}{\det(B_n)} \sum_{x \in P_n} f(x),
\]
where $B_n \in \mathbb{R}^{d \times d}$ is a suitable random matrix and $P_n \subset \mathbb{R}^d$ is the set of corresponding randomized Frolov points, which in particular depend on $B_n$. We emphasize that
\[
\mathbb{E}\,|A_n(f) - S(f)|^2 \le \frac{c}{n}\, \|f\|_2^2,
\]
where $c > 0$ does not depend on $f$ and $n$. This implies a WLLN for $L^2$ functions and hence, by Corollary 2.2, a WLLN for $L^1$ follows. Note that in [Ull17] the definition and analysis of this method is done for more general domains than $[0,1]^d$. Therefore taking Remark 2.3 into account leads also to WLLNs for more general domains.
3 Median modification
Let $m \in \mathbb{N}$ be an odd number and $f \in L^p$, for some $p > 1$. We write $M_{n,m}(f)$ for the median of $m$ independent copies of $A_n(f)$, denoted by $A_n^{(1)}(f), \dots, A_n^{(m)}(f)$. Employing results from [NP09] it was shown in [KR19] that we have the following:
Lemma 3.1.
Let the Monte Carlo method $A_n$ satisfy
\[
\mathbb{P}\bigl( |A_n(f) - S(f)| > \varepsilon \bigr) \le \theta \tag{8}
\]
for some $\theta \in (0, 1/2)$. Then,
\[
\mathbb{P}\bigl( |M_{n,m}(f) - S(f)| > \varepsilon \bigr) \le \bigl( 4\theta(1 - \theta) \bigr)^{m/2}.
\]
In [KR19] Lemma 3.1 is proven in the case where $A_n$ uses exactly $n$ function evaluations. However, reviewing the proof of [NP09, Proposition 2.1] it follows that this condition can be relaxed and that Lemma 3.1 is also applicable in more general situations, e.g., for randomized Frolov points as in Example 2.7.
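The median modification is straightforward to sketch in Python; the names and the plain Monte Carlo base estimator below are illustrative stand-ins for a general $A_n$. One runs $m$ independent copies of the base estimator and returns their median, with odd $m$ so that the median is itself one of the realizations.

```python
import random
import statistics

def median_of_estimates(f, estimator, n, m, rng):
    """Median of m independent copies of a base estimator A_n(f)."""
    assert m % 2 == 1  # odd, so the median is a single realization
    return statistics.median(estimator(f, n, rng) for _ in range(m))

def plain_mc(f, n, rng):
    """Base estimator: equal-weight Monte Carlo with n uniform samples."""
    return sum(f(rng.random()) for _ in range(n)) / n

# Approximate the integral of x^2 over [0, 1]; the true value is 1/3.
rng = random.Random(1)
est = median_of_estimates(lambda x: x * x, plain_mc, 10_000, 11, rng)
```

Each copy only needs a weak constant-confidence guarantee as in Lemma 3.1; the median then boosts the confidence exponentially in $m$.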
To deduce error bounds for $M_{n,m}(f)$ we make use of the following result, which is a version of the Riesz-Thorin interpolation theorem, see [Gra14, Theorem 1.3.4].
Theorem 3.4 (Riesz-Thorin).
Let $(\mathcal{X}, \mu)$ and $(\mathcal{Y}, \nu)$ be measure spaces. Let $T$ be a linear operator from $L^{p_0}(\mu)$ to $L^{q_0}(\nu)$ and also a linear operator from $L^{p_1}(\mu)$ to $L^{q_1}(\nu)$, for some $p_0, p_1, q_0, q_1 \in [1, \infty]$. Assume that
\[
\|Tf\|_{L^{q_0}(\nu)} \le M_0\, \|f\|_{L^{p_0}(\mu)}
\quad \text{and} \quad
\|Tf\|_{L^{q_1}(\nu)} \le M_1\, \|f\|_{L^{p_1}(\mu)}.
\]
For $\theta \in (0,1)$ define $p_\theta$ and $q_\theta$ via $\frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$ and $\frac{1}{q_\theta} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}$. Then, $T$ is also a linear operator from $L^{p_\theta}(\mu)$ to $L^{q_\theta}(\nu)$ satisfying
\[
\|Tf\|_{L^{q_\theta}(\nu)} \le M_0^{1-\theta} M_1^{\theta}\, \|f\|_{L^{p_\theta}(\mu)}.
\]
3.1 SLLNs via median modification
Throughout this section we assume that $A_n$ is a linear and unbiased Monte Carlo method satisfying the following variance estimate
\[
\mathbb{E}\,|A_n(f) - S(f)|^2 \le \frac{c}{n}\, \|f\|_2^2 \tag{9}
\]
for an absolute constant $c > 0$ and any $f \in L^2$.
We set $(\mathcal{X}, \mu) = ([0,1]^d, \lambda^d)$ and $(\mathcal{Y}, \nu) = (\Omega, \mathbb{P})$, where $\lambda^d$ denotes the $d$-dimensional Lebesgue measure. Moreover, let $p_0 = q_0 = 1$, $p_1 = q_1 = 2$ and $f \in L^p$ for some $p \in (1, 2)$. Then, for $n \in \mathbb{N}$ the following operators are well defined, $T_n f = A_n(f) - S(f)$, where $\|T_n f\|_{L^1(\mathbb{P})} \le 2\,\|f\|_1$ and, by (9), $\|T_n f\|_{L^2(\mathbb{P})} \le (c/n)^{1/2}\, \|f\|_2$. An application of Theorem 3.4 yields
\[
\bigl( \mathbb{E}\,|A_n(f) - S(f)|^p \bigr)^{1/p} \le C\, n^{-(p-1)/p}\, \|f\|_p, \tag{10}
\]
where $f \in L^p$ is arbitrary and $C > 0$ does not depend on $n$ and $f$. Combining Markov’s inequality, (10) and Lemma 3.1 shows that $\mathbb{P}\bigl( |M_{n,m}(f) - S(f)| > \varepsilon \bigr)$ decays, up to a constant, at the polynomial rate $n^{-(p-1)m/2}$. If additionally $m > \frac{2}{p-1}$, it follows immediately from the lemma of Borel-Cantelli that a SLLN holds for $M_{n,m}(f)$.
Note that without using the median of several independent copies of $A_n(f)$ it is still possible to find SLLNs by considering $A_{n_k}(f)$ for a suitably chosen (sub)sequence $(n_k)_{k \in \mathbb{N}}$. By the lemma of Borel-Cantelli and (10) a sufficient condition is $\sum_{k=1}^{\infty} n_k^{-(p-1)} < \infty$.
The techniques used in this section can be adapted to the case where instead of (9) one has an estimate of the mean error $\mathbb{E}\,|A_n(f) - S(f)| \le c\, n^{-\alpha}\, \|f\|_2$, for some $\alpha > 0$.
Julian Hofstadler gratefully acknowledges support of the DFG within project 432680300 – SFB 1456 subproject B02. The authors thank David Krieg, Art Owen, Erich Novak and Mario Ullrich for valuable comments.
- [AHT12] C. Aistleitner, M. Hofer, and R. Tichy. A central limit theorem for Latin Hypercube Sampling with dependence and application to exotic basket option pricing. International Journal of Theoretical and Applied Finance, 15(07):1250046, 2012.
- [BKJ+19] M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: Bayesian optimization in PyTorch. In Advances in Neural Information Processing Systems 33, 2019.
- [CP76] R. Cranley and T. N. L. Patterson. Randomization of Number Theoretic Methods for Multiple Integration. SIAM Journal on Numerical Analysis, 13(6):904–914, 1976.
- [DP10] J. Dick and F. Pillichshammer. Digital Nets and Sequences: Discrepancy Theory and Quasi–Monte Carlo Integration. Cambridge University Press, 2010.
- [Ete81] N. Etemadi. An elementary proof of the strong law of large numbers. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 55(1):119–122, 1981.
- [GL22] T. Goda and P. L’Ecuyer. Construction-free median quasi-Monte Carlo rules for function spaces with unspecified smoothness and general weights. arXiv preprint arXiv:2201.09413, 2022.
- [Gra14] L. Grafakos. Classical Fourier Analysis, volume 249 of Graduate Texts in Mathematics. Springer, third edition, 2014.
- [Hei94] S. Heinrich. Random approximation in numerical analysis. Proceedings of the Conference “Functional Analysis” Essen, pages 123–171, 1994.
- [KN17] D. Krieg and E. Novak. A universal algorithm for multivariate integration. Foundations of Computational Mathematics, 17(4):895–916, 2017.
- [KNR19] R. J. Kunsch, E. Novak, and D. Rudolf. Solvable integration problems and optimal sample size selection. Journal of Complexity, 53:40–67, 2019.
- [KR19] R. J. Kunsch and D. Rudolf. Optimal confidence for Monte Carlo integration of smooth functions. Advances in Computational Mathematics, 45(5):3095–3122, 2019.
- [Loh96] W.-L. Loh. On Latin Hypercube Sampling. The Annals of Statistics, 24(5):2058–2080, 1996.
- [MBC79] M. D. Mckay, R. J. Beckman, and W. J. Conover. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics, 21(2):239–245, 1979.
- [MGNR12] T. Müller-Gronbach, E. Novak, and K. Ritter. Monte Carlo-Algorithmen. Springer-Verlag, 2012.
- [Nov88] E. Novak. Deterministic and stochastic error bounds in numerical analysis, Lecture notes in Mathematics 1349. Springer, 1988.
- [NP09] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. Journal of Applied Probability, 46(2):309–329, 2009.
- [OR21] A. B. Owen and D. Rudolf. A Strong Law of Large Numbers for Scrambled Net Integration. SIAM Review, 63(2):360–372, 2021.
- [Owe95] A. B. Owen. Randomly permuted $(t,m,s)$-nets and $(t,s)$-sequences. In H. Niederreiter and P. J.-S. Shiue, editors, Monte Carlo and quasi-Monte Carlo Methods in Scientific Computing, pages 299–317. Springer, 1995.
- [Owe13] A. B. Owen. Monte Carlo theory, methods and examples. 2013.
- [PO21] Z. Pan and A. B. Owen. Super-polynomial accuracy of one dimensional randomized nets using the median-of-means. arXiv preprint arXiv:2111.12676, 2021.
- [PS08] N. Packham and W. Schmidt. Latin Hypercube Sampling with dependence and applications in finance. Journal of Computational Finance, 13(3):81–111, 2010.
- [Ull16] M. Ullrich. On “Upper Error Bounds for Quadrature Formulas on Function Classes” by K.K. Frolov. In R. Cools, D. Nuyens editors, Monte Carlo and Quasi-Monte Carlo Methods, pages 571–582. Springer, 2016.
- [Ull17] M. Ullrich. A Monte Carlo Method for Integration of Multivariate Smooth Functions. SIAM Journal on Numerical Analysis, 55(3):1188–1200, 2017.
- [UU16] M. Ullrich and T. Ullrich. The Role of Frolov’s Cubature Formula for Functions with Bounded Mixed Derivative. SIAM Journal on Numerical Analysis, 54(2):969–993, 2016.