Consistency of randomized integration methods

For integrable functions we provide a weak law of large numbers for structured Monte Carlo methods, such as estimators based on randomized digital nets, Latin hypercube sampling, randomized Frolov point sets, and Cranley-Patterson rotations. Moreover, we propose median modified methods and show that for integrands in L^p with p > 1 a strong law of large numbers holds.


1 Introduction

In computational statistics and numerical analysis one of the major challenges is the development and investigation of estimators of expectations. A prototypical setting is the approximation of the integral

 I(f) = ∫_{[0,1]^d} f(x) dx (1)

w.r.t. the Lebesgue measure for an integrable function f: [0,1]^d → ℝ, i.e., a measurable function with ∫_{[0,1]^d} |f(x)| dx < ∞. Given a (sufficiently rich) probability space (Ω, 𝒜, P) (all random variables are assumed to be defined on this probability space), a measurable space (G, 𝒢), a measurable function g: G → ℝ and a G-valued random variable Y, we can ask more generally for the computation of

 I(g) = E g(Y). (2)

For this goal (structured) Monte Carlo methods have been constructed. For n ∈ ℕ a Monte Carlo method is given by a real-valued random variable S_n g which typically depends on a G-valued sequence of random variables and on g. Here n indicates the number of function evaluations of g that are allowed to be used. Classical examples of such methods (exploiting different properties of g) for the approximation of I(g) are for instance antithetic sampling, importance sampling or control variate based estimators, see e.g. [MGNR12, Owe13].

We focus on the prototypical setting of the approximation of I(f) from (1), where f: [0,1]^d → ℝ is integrable, and consider methods based on randomized (t,s)-sequences [Owe95], Latin hypercube sampling [MBC79], randomized Frolov point sets [KN17] and Cranley-Patterson rotations [CP76].

A minimal requirement for a reasonable proxy S_n f of I(f) is that for sufficiently large and increasing n it gets ‘close’ to I(f). We ask of the sequence (S_n f)_{n∈ℕ} of random variables that it at least converges in probability to I(f), i.e., for all ε > 0 we have

 lim_{n→∞} P[|S_n f − I(f)| > ε] = 0. (3)

A stronger desirable property is almost sure convergence, i.e.,

 P[lim_{n→∞} S_n f = I(f)] = 1. (4)

For the classical Monte Carlo estimator S_n f = (1/n) ∑_{i=1}^n f(X_i), based on an iid sequence (X_i)_{i∈ℕ} of uniformly distributed random variables on [0,1]^d, (3) is known as the weak law of large numbers (WLLN) and (4) as the strong law of large numbers (SLLN). Motivated by this fact, we also say a (general) Monte Carlo method satisfies such a law of large numbers if the corresponding limit property ((3) or (4)) holds.
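For orientation, the classical estimator can be sketched in a few lines (an illustrative sketch, not part of the paper; the integrand and sample size are arbitrary choices):

```python
import random

def mc_estimate(f, n, d, rng):
    """Classical Monte Carlo estimator S_n f: the average of f over
    n iid points drawn uniformly from [0,1]^d."""
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        total += f(x)
    return total / n

# Example integrand f(x) = x_1 * x_2 on [0,1]^2, with I(f) = 1/4.
rng = random.Random(0)
est = mc_estimate(lambda x: x[0] * x[1], 100_000, 2, rng)
```

By the SLLN for iid sampling, `est` converges almost surely to 1/4 as the number of samples grows.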

Depending on integrability properties of the integrand we provide conditions for general methods to satisfy a SLLN or WLLN. For p ∈ [1,∞) we define L^p to be the set of all measurable functions f: [0,1]^d → ℝ with finite ‖f‖_p = (∫_{[0,1]^d} |f(x)|^p dx)^{1/p}. We show that for all aforementioned estimators of (1) a WLLN holds whenever f ∈ L^1. Furthermore, we argue for f ∈ L^p with p > 1 how to get a SLLN for median modified methods. In the latter scenario we follow the approach of [OR21].

Now let us discuss how these results fit into the literature and whether they can be improved. Define the expected absolute and probabilistic error of an arbitrary Monte Carlo method S_n for f with ‖f‖_1 ≤ 1 as E[|S_n f − I(f)|] and P[|S_n f − I(f)| > ε], respectively. Then, by virtue of [KR19, Theorem 2.3] there are constants c > 0, ε_0 > 0 and n_0 ∈ ℕ with the following properties: For any n ≥ n_0 and any arbitrary Monte Carlo method S_n we can find a function f satisfying ‖f‖_1 ≤ 1, such that for ε ≤ ε_0 holds

 P[|S_n f − I(f)| > ε] ≥ c. (5)

As a consequence, see also [Hei94, Nov88], for the expected absolute error and n ≥ n_0 we obtain

 E[|S_n f − I(f)|] ≥ c · ε_0 / 2. (6)

By (5), a WLLN cannot be improved to a non-trivial bound of the probabilistic error uniformly for all f with ‖f‖_1 ≤ 1. Due to (6) we also have that no method has a decreasing expected absolute error uniformly for this class of integrands. But it is well known that the classical Monte Carlo estimator satisfies a SLLN for f ∈ L^1 and that a SLLN implies a WLLN. Therefore, for a given method for the approximation of I(f) with f ∈ L^1 we can ask for a SLLN, even when the uniform probabilistic and expected absolute error does not decrease.

Motivated by Bayesian optimization [BKJ19] the recent work [OR21] provides a SLLN for digital nets randomized by a nested uniform scrambling for f ∈ L^p with p > 1. The arguments there are based on the Riesz-Thorin interpolation theorem as well as a subsequence technique already appearing in [Ete81]. We consider a median of general estimators and show a SLLN without using the subsequence technique. However, we also require f ∈ L^p with p > 1 for the integrand. In contrast to that, our generic results about the WLLN, see Theorem 2.1 and Corollary 2.2, require only f ∈ L^1. We apply these to different settings including randomized digital nets and Latin hypercube sampling, therefore extending the WLLNs of [Loh96, OR21].

Now we briefly outline the structure. Section 2 is devoted to WLLNs. In particular, Theorem 2.1 and Corollary 2.2 establish WLLNs for f ∈ L^1 under further mild conditions. We discuss estimators based on samples from randomized (t,s)-sequences, Latin hypercube sampling and the randomized Frolov algorithm, for which this result is applicable. Moreover, we consider Cranley-Patterson rotations, for which it is also possible to derive a WLLN. In Section 3 we describe how the median of independent realizations improves the probabilistic error and can be used to derive a SLLN for f ∈ L^p with p > 1.

2 Weak laws of large numbers

The following two results are our main tools for establishing WLLNs for Monte Carlo methods. Both are applicable to a broad class of algorithms and the underlying requirements are easy to verify.

Theorem 2.1.

Let S_n be some (randomized) estimator for I(f), where f ∈ L^1, and assume that S_n is linear (i.e., S_n(αf + βg) = α S_n f + β S_n g for α, β ∈ ℝ and f, g ∈ L^1) and unbiased (i.e., E[S_n f] = I(f) for any f ∈ L^1). If the WLLN holds for some dense subset F ⊆ L^1 (w.r.t. ‖·‖_1), then the WLLN also holds for f ∈ L^1.

Proof.

Let f ∈ L^1 and ε, δ > 0 be arbitrary. Choose ϕ ∈ F such that ‖f − ϕ‖_1 is sufficiently small; in particular ‖f − ϕ‖_1 ≤ ε/3. We observe that

 P[|S_n f − I(f)| > ε] ≤ P[|S_n f − S_n ϕ| > ε/3] + P[|S_n ϕ − I(ϕ)| > ε/3] + P[|I(ϕ) − I(f)| > ε/3].

Applying Markov’s inequality, linearity and unbiasedness of S_n we deduce P[|S_n f − S_n ϕ| > ε/3] ≤ δ/2 for ‖f − ϕ‖_1 small enough. Similarly, |I(ϕ) − I(f)| ≤ ‖f − ϕ‖_1 ≤ ε/3, so that P[|I(ϕ) − I(f)| > ε/3] = 0. By assumption we have a WLLN for ϕ ∈ F, hence there is some n_0 ∈ ℕ such that for any n ≥ n_0 holds P[|S_n ϕ − I(ϕ)| > ε/3] ≤ δ/2. In conclusion,

 P[|S_n f − I(f)| > ε] ≤ δ,

for any n ≥ n_0. Since δ was arbitrary the result follows. ∎

Corollary 2.2.

Let S_n be some (randomized) estimator for I(f), where f ∈ L^1, and assume that S_n is linear and unbiased. If the WLLN holds for L^q functions, for some q ∈ (1,∞], then the WLLN also holds for f ∈ L^1.

Proof.

Denote by C_c^∞((0,1)^d) the set of all functions that are infinitely often differentiable and whose support is a compact subset of (0,1)^d. We have C_c^∞((0,1)^d) ⊆ L^q for any q ∈ (1,∞], and it is a well known fact that C_c^∞((0,1)^d) is dense in L^1 w.r.t. ‖·‖_1. (In fact one has that C_c^∞((0,1)^d) is dense in L^1((0,1)^d). However, the open and closed unit cube only differ by a set of measure zero, so the density result easily extends to L^1([0,1]^d).)

By assumption we have a WLLN for L^q functions, and consequently for C_c^∞((0,1)^d). Thus the result follows by an application of Theorem 2.1. ∎

Remark 2.3.

In the proofs of Theorem 2.1 and Corollary 2.2 we restricted ourselves to [0,1]^d; however, this assumption can be relaxed to obtain WLLNs for estimating I(g) as in (2). Furthermore, we did not require that the number of function evaluations allowed to be used by S_n is (exactly) n.

We now turn to examples. First, we consider an estimator of the form

 S_n(f) = (1/n) ∑_{y ∈ P_n} f(y), (7)

where P_n = {X_1^{(n)}, …, X_n^{(n)}} is a random point set, i.e., X_i^{(n)} denotes a [0,1]^d-valued random variable for any i ∈ {1, …, n}. Obviously, S_n is linear, and whenever P_n consists of uniformly distributed random variables in [0,1]^d it is also unbiased.

Example 2.4.

We consider randomized (t,s)-sequences as in [Owe95]. The underlying deterministic points are (t,s)-sequences, which prove to be useful for numerical integration (cf. [DP10]), and randomization is done, roughly speaking, by randomly permuting digits. One obtains a sequence of random points, denoted by (X_i)_{i∈ℕ}, which consists of uniformly distributed points and satisfies crucial properties of a (t,s)-sequence with probability one, see [Owe95]. Setting P_n = {X_1, …, X_n} and defining S_n as specified in (7) we indeed have a linear and unbiased estimator.

For f ∈ L^p with p > 1 the method satisfies a SLLN, as recently shown in [OR21], and consequently also a WLLN. Then, by an application of Corollary 2.2 the WLLN holds for f ∈ L^1, too.
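To illustrate the digit randomization, the following sketch implements Owen-style nested uniform scrambling in base 2 for the one-dimensional van der Corput sequence. It is a simplified stand-in for the general base-b, s-dimensional construction of [Owe95]; all parameter choices are ours.

```python
import random

def scrambled_van_der_corput(n_points, n_bits, rng):
    """Nested uniform scrambling in base 2 (dimension 1): the flip applied
    to digit j of a point depends on its preceding (original) digits, and
    the random flip bit for each digit-prefix is drawn lazily and reused."""
    flips = {}  # maps digit prefix -> random flip bit
    points = []
    for i in range(n_points):
        digits = [(i >> b) & 1 for b in range(n_bits)]  # radical-inverse digits
        x = 0.0
        prefix = ()
        for j, dj in enumerate(digits):
            flip = flips.setdefault(prefix, rng.getrandbits(1))
            x += (dj ^ flip) / 2 ** (j + 1)
            prefix = prefix + (dj,)
        # fill the remaining digits with uniform noise so each point is U[0,1)
        x += rng.random() / 2 ** n_bits
        points.append(x)
    return points

rng = random.Random(1)
pts = scrambled_van_der_corput(256, 16, rng)
est = sum(p * p for p in pts) / len(pts)  # estimates ∫ x^2 dx = 1/3
```

Each scrambled point is uniformly distributed on [0,1), so the plug-in average is unbiased, while the net structure is preserved with probability one.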

Example 2.5.

We study Latin hypercube sampling [MBC79]. The corresponding algorithm produces n uniformly distributed points in [0,1]^d satisfying stratification properties without requiring an infeasible amount of computational resources. We refer to the book [Owe13] for a more detailed introduction.

Using those points for P_n within S_n of (7) yields a linear and unbiased estimator. In [Loh96] a SLLN covering in particular bounded integrands is established, which implies a WLLN for such functions. Making use of Corollary 2.2 the WLLN generalizes to f ∈ L^1. We remark that consistency of Latin hypercube sampling was already observed in [AHT12, PS08]; however, a proof is not given there and seems tricky to find in the literature.
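A minimal Latin hypercube sampler fitting the estimator (7) can be sketched as follows (pure-Python sketch; the integrand is an arbitrary test choice):

```python
import random

def latin_hypercube(n, d, rng):
    """Latin hypercube sample: in every coordinate the n points occupy the
    n strata [j/n, (j+1)/n), each exactly once and in random order."""
    cols = []
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        cols.append([(j + rng.random()) / n for j in perm])
    return [tuple(col[i] for col in cols) for i in range(n)]

rng = random.Random(2)
pts = latin_hypercube(1000, 2, rng)
est = sum(x * y for x, y in pts) / len(pts)  # estimates ∫ x_1 x_2 = 1/4
```

Each point is uniformly distributed in [0,1]^2, so the average is unbiased, while the per-coordinate stratification reduces the variance relative to iid sampling.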

Example 2.6.

A Cranley-Patterson rotation, cf. [CP76], randomizes deterministic points a_1^{(n)}, …, a_n^{(n)} ∈ [0,1]^d to X_1^{(n)}, …, X_n^{(n)} by setting (coordinate-wise) X_i^{(n)} = (a_i^{(n)} + U) mod 1 for i ∈ {1, …, n}, where U is uniformly distributed in [0,1]^d. The X_i^{(n)}’s are again uniformly distributed in [0,1]^d, hence S_n as given in (7) is linear and unbiased. We refer to [Owe13, Chapter 17.3] for a more detailed discussion of this method.

Employing [DP10, Proposition 2.18], which is a version of the Koksma-Hlawka inequality, and then [Owe13, Inequality (17.9)] it follows that for ϕ ∈ C_c^∞((0,1)^d) with probability one

 |S_n ϕ − I(ϕ)| ≤ c_1 · D({X_1^{(n)}, …, X_n^{(n)}}) ≤ c_2 · D({a_1^{(n)}, …, a_n^{(n)}}).

Here c_1, c_2 ∈ (0,∞) are finite constants depending on ϕ only. Moreover, C_c^∞((0,1)^d) is as in the proof of Corollary 2.2 and D denotes the discrepancy (of a point set), see for instance the book [DP10] for details.

Assume now that D({a_1^{(n)}, …, a_n^{(n)}}) → 0 as n → ∞; then we immediately obtain a WLLN for functions in C_c^∞((0,1)^d), and by virtue of Theorem 2.1 a WLLN for f ∈ L^1 follows.

Under rather mild conditions regarding the construction of a Hammersley point set, a lattice point set or a (t,s)-sequence, the discrepancy of any of these point sets decreases to zero. However, it might be necessary to switch to a subsequence, cf. [DP10].
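A Cranley-Patterson rotation is straightforward to implement. The sketch below shifts a rank-1 lattice (the generator vector and size are illustrative, unoptimized choices of ours) and uses the shifted points in the estimator (7):

```python
import random

def cranley_patterson(points, rng):
    """One Cranley-Patterson rotation: add a single uniform shift
    U ~ U[0,1)^d to every point, coordinate-wise modulo 1."""
    d = len(points[0])
    u = [rng.random() for _ in range(d)]
    return [tuple((x + s) % 1.0 for x, s in zip(p, u)) for p in points]

# Deterministic rank-1 lattice in d = 2 with generator (1, 21) and n = 144.
n = 144
lattice = [((i / n) % 1.0, (21 * i / n) % 1.0) for i in range(n)]

rng = random.Random(3)
shifted = cranley_patterson(lattice, rng)
est = sum(x * y for x, y in shifted) / n  # estimates ∫ x_1 x_2 = 1/4
```

Since each shifted point is uniform on [0,1)^2, the average is an unbiased estimate of the integral, and the lattice structure of the underlying points is preserved by the common shift.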

We add an example where S_n does not take the form of (7).

Example 2.7.

Randomized Frolov points, as studied in [KN17] and later in [Ull17], rely on a random shift and dilation of (deterministic) Frolov points (cf. [Ull16] for an introductory paper), which themselves provide a powerful tool for numerical integration in Sobolev spaces, see [UU16] for a survey.

For technical reasons we extend any f ∈ L^1 to a function defined on ℝ^d by setting it to zero outside the unit cube. Hence for the integral of interest holds I(f) = ∫_{ℝ^d} f(x) dx.

Here for any n ∈ ℕ the estimator is given by

 S_n f = (1/|det(A_n)|) ∑_{y ∈ P_n} f(y),

where A_n ∈ ℝ^{d×d} is a suitable random matrix and P_n ⊂ ℝ^d is the set of corresponding randomized Frolov points, which in particular depend on A_n. We emphasize that

 E|det(A_n)| = E[card(P_n)] = n,

where card(P_n) denotes the cardinality of P_n. For details we refer to [Ull17, Section 2.1]. Obviously one has a linear estimator, and [KN17, Lemma 3] guarantees that S_n is also unbiased.

For f ∈ L^2, extended as above, we employ [Ull17, Theorem 1.1] and Plancherel’s identity (see e.g. [Gra14, Theorem 2.2.14]) to deduce that

 E[|S_n f − I(f)|^2] ≤ c · ‖f‖_2^2 / n, (8)

where c ∈ (0,∞) does not depend on f and n. This implies a WLLN for L^2 functions and hence, by Corollary 2.2, a WLLN for f ∈ L^1 follows. Note that in [Ull17] the definition and analysis of this method is done for more general domains than [0,1]^d. Therefore, taking Remark 2.3 into account leads also to WLLNs for more general domains.

3 Median modification

In this section we derive SLLNs for f ∈ L^p with p > 1 by using median modified methods and combining techniques which were also used in [KR19], [OR21] and [KNR19]. Let S_n be any linear and unbiased Monte Carlo method, k ∈ ℕ an odd number and f ∈ L^p, for some p > 1. We write S_{n,k} f for the median of k independent copies of S_n f, denoted by S_n^{(1)} f, …, S_n^{(k)} f. Employing results from [NP09] it was shown in [KR19] that we have the following:

Lemma 3.1.

Let the Monte Carlo method S_n satisfy

 P[|S_n f − I(f)| > ε] ≤ α < 1/2,

for some ε > 0 and α ∈ (0, 1/2). Then,

 P[|S_{n,k} f − I(f)| > ε] ≤ 2^k α^{k/2}.
Remark 3.2.

In [KR19] Lemma 3.1 is proven in the case where S_n uses exactly n function evaluations. However, reviewing the proof of [NP09, Proposition 2.1] it follows that this condition can be relaxed and that Lemma 3.1 is also applicable in more general situations, e.g., for randomized Frolov points as in Example 2.7.

Remark 3.3.

Let us also mention that median modified randomized quasi-Monte Carlo methods, see [PO21] and [GL22], have recently attracted considerable attention.
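The median modification S_{n,k} itself is easy to realize. Below is a sketch with plain Monte Carlo as the base estimator (the setting of this section allows any linear, unbiased S_n; the integrand and parameters here are arbitrary choices of ours):

```python
import random
import statistics

def mc(f, n, d, rng):
    """Base estimator S_n f: plain Monte Carlo average over n uniform points."""
    return sum(f([rng.random() for _ in range(d)]) for _ in range(n)) / n

def median_estimator(f, n, d, k, rng):
    """Median modification S_{n,k} f: the median of k independent copies
    of S_n f; k is odd, so the median is itself one of the k estimates."""
    assert k % 2 == 1
    return statistics.median(mc(f, n, d, rng) for _ in range(k))

rng = random.Random(4)
est = median_estimator(lambda x: x[0] ** 2, 2000, 1, 11, rng)  # I(f) = 1/3
```

Taking the median suppresses outlier realizations of the base estimator, which is exactly the amplification effect quantified in Lemma 3.1.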

To deduce error bounds for S_{n,k} we make use of the following result, which is a version of the Riesz-Thorin interpolation theorem, see [Gra14, Theorem 1.3.4].

Theorem 3.4 (Riesz-Thorin).

Let A_1 and A_2 be measure spaces. Let T be a linear operator from L^{p_1}(A_1) to L^{p_1}(A_2) and also a linear operator from L^{p_2}(A_1) to L^{p_2}(A_2), for some 1 ≤ p_1 < p_2 ≤ ∞. Assume that

 ‖T‖_{L^{p_1}(A_1)→L^{p_1}(A_2)} ≤ M_1 and ‖T‖_{L^{p_2}(A_1)→L^{p_2}(A_2)} ≤ M_2.

For θ ∈ (0,1) define p via 1/p = (1−θ)/p_1 + θ/p_2. Then, T is also a linear operator from L^p(A_1) to L^p(A_2) satisfying

 ‖T‖_{L^p(A_1)→L^p(A_2)} ≤ M_1^{1−θ} M_2^θ.

3.1 SLLNs via median modification

Throughout this section we assume that S_n is a linear and unbiased Monte Carlo method satisfying the following variance estimate:

 Var[S_n f] ≤ c · ‖f‖_2^2 / n, (9)

for an absolute constant c ∈ (0,∞) and any f ∈ L^2.

We set A_1 = ([0,1]^d, λ^d) and A_2 = (Ω, P), where λ^d denotes the d-dimensional Lebesgue measure. Moreover, let p_1 = 1, p_2 = 2 and p ∈ (1,2), and choose θ ∈ (0,1) such that 1/p = (1−θ) + θ/2. Then, for n ∈ ℕ the operators T_n f = S_n f − I(f), f ∈ L^p, are well defined from L^p(A_1) to L^p(A_2). An application of Theorem 3.4 yields

 ‖T_n‖_p = ‖T_n‖_{L^p(A_1)→L^p(A_2)} ≤ 2^{1−θ} c^{θ/2} n^{−θ/2} = c_p n^{−(1−1/p)}

with c_p = 2^{1−θ} c^{θ/2}. Here we used, with the same notation as in Theorem 3.4, that M_1 ≤ 2 since S_n is unbiased, and M_2 ≤ (c/n)^{1/2} in view of (9). Therefore, by Markov’s inequality,

 P[|S_n f − I(f)| > ε] ≤ ε^{−p} ‖T_n f‖_p^p ≤ (c_p ‖f‖_p / ε)^p n^{−(p−1)}. (10)

Since the right-hand side in (10) decreases to 0 as n → ∞, there is some n_0 ∈ ℕ such that for any n ≥ n_0 Lemma 3.1 is applicable. Thus for n ≥ n_0,

 P[|S_{n,k} f − I(f)| > ε] ≤ 2^k (c_p ‖f‖_p / ε)^{pk/2} n^{−(p−1)k/2},

where ε > 0 is arbitrary. If additionally (p−1)k/2 > 1, i.e., the odd number k satisfies k > 2/(p−1), it follows immediately from the lemma of Borel-Cantelli that a SLLN holds.
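For completeness, the choice of the interpolation parameter and the summability condition behind the Borel-Cantelli step can be spelled out (an elementary computation under the choices p_1 = 1, p_2 = 2 made above):

```latex
% Interpolation parameter for p_1 = 1, p_2 = 2:
\frac{1}{p} \;=\; \frac{1-\theta}{1} + \frac{\theta}{2} \;=\; 1 - \frac{\theta}{2}
\quad\Longrightarrow\quad
\theta = 2\Bigl(1 - \frac{1}{p}\Bigr),
\qquad
n^{-\theta/2} = n^{-(1-1/p)} .

% Borel--Cantelli step for the median estimator: the series
\sum_{n=1}^{\infty}
2^{k}\Bigl(\frac{c_p \|f\|_p}{\varepsilon}\Bigr)^{pk/2} n^{-(p-1)k/2}
% converges precisely when (p-1)k/2 > 1, i.e. for odd k > 2/(p-1).
```

Thus a fixed odd k exceeding 2/(p−1) already suffices; no growth of k with n is needed.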

Remark 3.5.

Note that without using the median of several independent copies of S_n it is still possible to obtain SLLNs by considering (S_{n_j} f)_{j∈ℕ} for a suitably chosen (sub)sequence (n_j)_{j∈ℕ}. By the lemma of Borel-Cantelli and (10) a sufficient condition is ∑_{j=1}^∞ n_j^{−(p−1)} < ∞.

Example 3.6.

We again consider S_n with P_n based on Latin hypercube sampling as in Example 2.5. In this example the desired estimate (9) is established, for example, in [Owe13, Corollary 17.3]. Thus, the median S_{n,k} f for odd k > 2/(p−1), or a suitable subsequence of (S_n f)_{n∈ℕ}, for instance as suggested in Remark 3.5, satisfies a SLLN for f ∈ L^p with p > 1.

Example 3.7.

We continue with S_n from Example 2.7 based on randomized Frolov points. By (8) we deduce a variance estimate as required in (9), resulting in a SLLN for f ∈ L^p with p > 1, either for the median S_{n,k} f of k independent copies (k odd, k > 2/(p−1)), or for (S_n f)_{n∈ℕ} by passing to a suitable subsequence (see Remark 3.5).

Remark 3.8.

The techniques used in this section can be adapted to the case where, instead of the variance bound (9), one has an estimate for the mean error E[|S_n f − I(f)|^q], for some q > 1.

Acknowledgements

Julian Hofstadler gratefully acknowledges support of the DFG within project 432680300 – SFB 1456 subproject B02. The authors thank David Krieg, Art Owen, Erich Novak and Mario Ullrich for valuable comments.

References

• [AHT12] C. Aistleitner, M. Hofer, and R. Tichy.

A central limit theorem for Latin Hypercube Sampling with dependence and application to exotic basket option pricing.

International Journal of Theoretical and Applied Finance, 15(07):1250046, 2012.
• [BKJ19] M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy.

BoTorch: Bayesian optimization in PyTorch.

In Advances in Neural Information Processing Systems 33, 2019.
• [CP76] R. Cranley and T. N. L. Patterson. Randomization of Number Theoretic Methods for Multiple Integration. SIAM Journal on Numerical Analysis, 13(6):904–914, 1976.
• [DP10] J. Dick and F. Pillichshammer. Digital Nets and Sequences: Discrepancy Theory and Quasi–Monte Carlo Integration. Cambridge University Press, 2010.
• [Ete81] N. Etemadi. An elementary proof of the strong law of large numbers. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 55(1):119–122, 1981.
• [GL22] T. Goda and P. L’Ecuyer. Construction-free median quasi-Monte Carlo rules for function spaces with unspecified smoothness and general weights. arXiv preprint arXiv:2201.09413, 2022.
• [Gra14] L. Grafakos. Classical Fourier Analysis, volume 249 of Graduate Texts in Mathematics. Springer, third edition, 2014.
• [Hei94] S. Heinrich. Random approximation in numerical analysis. Proceedings of the Conference “Functional Analysis” Essen, pages 123–171, 1994.
• [KN17] D. Krieg and E. Novak. A universal algorithm for multivariate integration. Foundations of Computational Mathematics, 17(4):895–916, 2017.
• [KNR19] R. J. Kunsch, E. Novak, and D. Rudolf. Solvable integration problems and optimal sample size selection. Journal of Complexity, 53:40–67, 2019.
• [KR19] R. J. Kunsch and D. Rudolf. Optimal confidence for Monte Carlo integration of smooth functions. Advances in Computational Mathematics, 45(5):3095–3122, 2019.
• [Loh96] W.-L. Loh. On Latin Hypercube Sampling. The Annals of Statistics, 24(5):2058–2080, 1996.
• [MBC79] M. D. Mckay, R. J. Beckman, and W. J. Conover. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics, 21(2):239–245, 1979.
• [MGNR12] T. Müller-Gronbach, E. Novak, and K. Ritter. Monte Carlo-Algorithmen. Springer-Verlag, 2012.
• [Nov88] E. Novak. Deterministic and stochastic error bounds in numerical analysis, Lecture notes in Mathematics 1349. Springer, 1988.
• [NP09] W. Niemiro and P. Pokarowski. Fixed precision MCMC estimation by median of products of averages. Journal of Applied Probability, 46(2):309–329, 2009.
• [OR21] A. B. Owen and D. Rudolf. A Strong Law of Large Numbers for Scrambled Net Integration. SIAM Review, 63(2):360–372, 2021.
• [Owe95] A. B. Owen. Randomly permuted (t,m,s)-nets and (t,s)-sequences. In H. Niederreiter and P. J.-S. Shiue, editors, Monte Carlo and quasi-Monte Carlo Methods in Scientific Computing, pages 299–317. Springer, 1995.
• [Owe13] A. B. Owen. Monte Carlo theory, methods and examples. 2013.
• [PO21] Z. Pan and A. B. Owen. Super-polynomial accuracy of one dimensional randomized nets using the median-of-means. arXiv preprint arXiv:2111.12676, 2021.
• [PS08] N. Packham and W. Schmidt. Latin Hypercube Sampling with dependence and applications in finance. Journal of Computational Finance, 13(3):81–111, 2010.
• [Ull16] M. Ullrich. On “Upper Error Bounds for Quadrature Formulas on Function Classes” by K.K. Frolov. In R. Cools, D. Nuyens editors, Monte Carlo and Quasi-Monte Carlo Methods, pages 571–582. Springer, 2016.
• [Ull17] M. Ullrich. A Monte Carlo Method for Integration of Multivariate Smooth Functions. SIAM Journal on Numerical Analysis, 55(3):1188–1200, 2017.
• [UU16] M. Ullrich and T. Ullrich. The Role of Frolov’s Cubature Formula for Functions with Bounded Mixed Derivative. SIAM Journal on Numerical Analysis, 54(2):969–993, 2016.