# Integration of bounded monotone functions: Revisiting the nonsequential case, with a focus on unbiased Monte Carlo (randomized) methods

In this article we revisit the problem of numerical integration for monotone bounded functions, with a focus on the class of nonsequential Monte Carlo methods. We first provide new a lower bound on the maximal L^p error of nonsequential algorithms, improving upon a theorem of Novak when p > 1. Then we concentrate on the case p = 2 and study the maximal error of two unbiased methods-namely, a method based on the control variate technique, and the stratified sampling method.

• 2 publications
• 14 publications
• 12 publications
01/16/2020

### On Quasi-Monte Carlo Methods in Weighted ANOVA Spaces

In the present paper we study quasi-Monte Carlo rules for approximating ...
02/28/2020

### Monte Carlo integration of non-differentiable functions on [0,1]^ι, ι=1,...,d, using a single determinantal point pattern defined on [0,1]^d

This paper concerns the use of a particular class of determinantal point...
02/08/2020

### An improvement of Koksma's inequality and convergence rates for the quasi-Monte Carlo method

When applying the quasi-Monte Carlo (QMC) method of numerical integratio...
09/04/2020

### A Generalization of Spatial Monte Carlo Integration

Spatial Monte Carlo integration (SMCI) is an extension of standard Monte...
03/31/2022

### Consistency of randomized integration methods

For integrable functions we provide a weak law of large numbers for stru...
01/18/2019

### Unreasonable effectiveness of Monte Carlo

This is a comment on the article "Probabilistic Integration: A Role in S...
12/04/2020

### Perceptual error optimization for Monte Carlo rendering

Realistic image synthesis involves computing high-dimensional light tran...

## 1 Introduction

We address in this article the problem of constructing a numerical approximation of the expectation , where

is a real random variable with known distribution

, and  is a real function that is bounded and monotone. Such a problem occurs naturally in applications where one is interested in computing a risk using a model that provides an increasing conditional risk  with respect to some random variable . This situation occurs for instance in the field of food safety, with a dose of pathogen and

the corresponding probability of food-borne illness

(see, e.g., Perrin et al., 2014).

Assuming that the cumulative distribution function of

is continuous, the problem reduces after change of variable and scaling to the computation of , where and  belongs to the class  of all non-decreasing functions defined on  and taking values in . We work in this article in a fixed sample-size setting, where the number of evaluations of  to be performed is chosen beforehand.

This problem was first studied by Kiefer (1957), who proved that considering regularly-spaced evaluations at , , and then using the trapezoidal integration rule assuming  and , is optimal in the worst-case sense among all deterministic—possibly sequential—methods. The corresponding value of the maximal error is . Novak (1992) later studied Monte Carlo (a.k.a. randomized) methods, and established that sequential methods are better in this setting than nonsequential ones, with a minimax rate of  over  for the  error.

This article revisits the nonsequential setting with a focus on unbiased Monte Carlo methods, which are a key building block for the construction of good (rate-optimal) sequential methods—as can be learned from the proof of Theorem 3 in Novak’s article. Section 2 derives a lower bound for the maximal -error of nonsequential methods, for any , which is a generalization of a result by Novak (1992) concerning the  error. Sections 3 and 4 then study the maximal

error (variance) of two simple unbiased methods, based respectively on the control variate technique and on stratification. Section

5 concludes the article with a discussion.

## 2 A lower bound for the maximal Lp error

A nonsequential (also called non-adaptive) Monte Carlo method first evaluates the function at  random points , …, in , and then approximates the integral

using an estimator

 ^Sn(f)=φ(f(X1),f(X2),…,f(Xn)), (1)

where is a measurable function. A nonsequential method is thus defined by two ingredients: the distribution of  and the function . The worst-case  error of such a method over the class  is

 ep(^Sn)=supf∈FE(∣∣S(f)−^Sn(f)∣∣p)1/p. (2)
###### Remark 1.

The class of nonsequential Monte Carlo methods as usually defined in the literature also allows  to be randomized (i.e., allows

to be a random function). We have not considered randomized estimators in our definition, however, since Rao-Blackwell’s theorem implies that they do not help in this setting, for any convex loss function.

Novak (1992) proved that for any nonsequential Monte Carlo method with sample size , the maximal  error is greater or equal to . We generalize this result to the case of the  error.

###### Theorem 2.1.

For any nonsequential Monte Carlo methods with sample size ,

 ep(^Sn)≥(12)2+1/p1n.

Observe that Novak’s lower bound is recovered for . Using Theorem 2.1 with we can deduce of lower bound for the variance of unbiased nonsequential methods.

###### Corollary 2.2.

For any unbiased nonsequential Monte Carlo method with sample size ,

 supf∈Fvar(^Sn(f))≥132n2.
###### Proof of Theorem 2.1.

Consider a nonsequential Monte Carlo methods with evaluation points , …, and estimator . Divide the interval into equal subintervals of length : then at least one of the subintervals, call it , will contain no evaluation point with probability at least . Now construct two functions that are both equal to zero on the left of , equal to one on the right, and such that and on . Then , and on the event , since  and  coincide outside of . It follows that

 (ep(^Sn))p ≥supf∈{f1,f2}E(|S(f)−^Sn(f)|p)≥122∑j=1E(|S(fj)−^Sn(fj)|p) ≥122∑j=1E(|S(fj)−^Sn(fj)|p⋅1A)=122∑j=1E(|S(fj)−T|p⋅1A),

where denotes the common value of  and  on . We conclude that

 (ep(^Sn))p≥12|I|p2p−1P(A)≥(12)2p+1⋅1np,

using the fact that, for any and , . ∎

## 3 Uniform i.i.d. sampling

The simple Monte Carlo method is the most common example of a nonsequential method: the evaluation points , …, are drawn independently, uniformly in , and then the integral is estimated by . The estimator is clearly unbiased, and it follows from Popoviciu’s inequality—i.e., for any random variable  taking values in —that

 (e2(^SMCn))2=maxf∈Fvar(^SMCn(f))=14n.

The maximal error is attained when  is a unit step function jumping at . It turns out that a smaller error can be achieved, for the same (uniform i.i.d.) sampling scheme, using the control variate technique. More specifically, we consider the control variate and set

 ^Scvn(f)=1nn∑i=1(f(Xi)−~f(Xi))+12.
###### Theorem 3.1.

The estimator  is unbiased, and satisfies

 (e2(^Scvn))2=maxf∈Fvar(^Scvn(f))=112n.

The maximal error is attained for any unit step function.

###### Proof.

The estimator is unbiased since , and therefore the mean-squared error is equal to . For a unit step function with a jump at , the random variable is uniformly distributed over , which yields as claimed. It remains to show that for all .

Let denote the class of all non-decreasing staircase functions of the form , with . For any , consider the piecewise-constant approximation defined by averaging  over each subinterval of length . Then, and . Thus,

 supf∈Fvar(f(X)−X)=limm→∞supf∈Fmvar(f(X)−X). (3)

Let us now show that is maximized over  when  is a unit step function. Pick any . Set and . If is not a unit step function, then there exist such that and . Denote by  the function obtained by changing the common value of , …, in  to . The variance is a convex function of , since it can be expanded as with . Consequently, we have at one of the two endpoints of . Note that the corresponding staircase function  has one fewer step than . Iterating as necessary, we conclude that for any  there exists a unit step function  such that . Therefore , which completes the proof. ∎

## 4 Stratified sampling

Consider now a stratified sampling estimator with  strata:

 ^Sstrn(f)=K∑k=1wk⋅1nknk∑i=1f(Xk,i), (4)

where the -th stratum is , , the weight is the length of the -th stratum, the allocation scheme is such that for all  and , and the random variables  are independent, with the s uniformly distributed in . Note that the sampling points are no longer identically distributed here. The estimator is unbiased, with variance

 var(^Sstrn(f))=K∑k=1w2knkvar(f(Xk,1)). (5)
###### Theorem 4.1.

For any , any choice of strata and any allocation scheme, the stratified sampling estimator (4) satisfies

 (e2(^Sstrn))2=maxf∈Fvar(^Sstrn(f))=14maxkw2knk, (6)

The maximal error is attained for a unit step function with a jump at the middle of , where . The minimal value of the maximal error (6) is , and is obtained with strata of equal lengths ( and for all ).

The optimal stratified sampling method can be seen as a one-dimensional special case of the Latin Hypercube Sampling (LHS) method (McKay et al., 1979). Novak (1992) relies on this method as a building block for the construction of a rate-optimal sequential method. (On a related note, McKay et al. (1979) prove that, in any dimension, the LHS method is preferable to the simple Monte Carlo method if the function is monotone in each of its arguments.)

###### Proof.

For all , let . For a given stratified sampling method with  strata, for all , define

 FΔ––={f∈F∣∣∀k∈{1,…,K},f(xk)−f(xk−1)=Δk}.

Then it follows from (5) and Popoviciu’s inequality that

 maxf∈FΔ––var(^Sstrn(f))=14K∑k=1w2kΔ2knk, (7)

where the maximum is attained for a non-decreasing staircase function with jumps of height  at the middle of the strata. Note that . Therefore, the right-hand side of (7) is upper-bounded by , which is indeed the value of the variance (5) when is a unit step function with a jump at the middle of the stratum where is the largest.

In order to prove the second part of the claim, observe first that any stratum such that  can be further divided into  sub-strata of equal lengths without increasing the upper bound. Considering then the case where and for all , the upper bound reduces to , which is minimal when since . ∎

## 5 Discussion

The stratified sampling (LHS) method provides the best-known variance upper bound over the class  for an unbiased nonsequential method as soon as , but is outperformed by the control variate method of Section 3 when

. We do not know at the moment if these results are optimal in the class of unbiased nonsequential methods. (The ratio between the best variance upper bound and the lower bound of Corollary

2.2 is for , for  and for .)

Relaxing the unbiasedness requirement, it turns out that both methods are outperformed for all  by the (deterministic) trapezoidal method discussed in the introduction, which has a worst-case squared error of . The ratio of worst-case mean-squared errors, however, is never very large—at most —and goes to  when  goes to infinity.

## References

• Perrin et al. (2014) F. Perrin, F. Tenenhaus-Aziza, V. Michel, S. Miszczycha, N. Bel, and M. Sanaa. Quantitative risk assessment of haemolytic and uremic syndrome linked to O157:H7 and non-O157:H7 shiga-toxin producing escherichia coli strains in raw milk soft cheeses. Risk Analysis, 35(1):109–128, 2014.
• Kiefer (1957) J. Kiefer. Optimum sequential search and approximation methods under minimum regularity assumptions. Journal of the Society for Industrial and Applied Mathematics, 5(3):105–136, 1957.
• Novak (1992) E. Novak. Quadrature formulas for monotone functions. Proceedings of the American Mathematical Society, 115(1):59–68, 1992.
• McKay et al. (1979) M. D. McKay, R. J. Beckman, and Conover W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2):239–245, 1979.