Toeplitz Monte Carlo

03/09/2020
by Josef Dick, et al.
UNSW
The University of Tokyo

Motivated mainly by applications to partial differential equations with random coefficients, we introduce a new class of Monte Carlo estimators, called the Toeplitz Monte Carlo (TMC) estimator, for approximating the integral of a multivariate function with respect to the direct product of an identical univariate probability measure. The TMC estimator generates a sequence x_1, x_2, ... of i.i.d. samples for one random variable, and then uses (x_{n+s-1}, x_{n+s-2}, ..., x_n) with n = 1, 2, ... as quadrature points, where s denotes the dimension. Although consecutive points have some dependency, the concatenation of all quadrature nodes is represented by a Toeplitz matrix, which allows for a fast matrix-vector multiplication. In this paper we study the variance of the TMC estimator and its dependence on the dimension s. Numerical experiments confirm a considerable efficiency improvement over the standard Monte Carlo estimator for applications to partial differential equations with random coefficients, particularly when the dimension s is large.

1 Introduction

The motivation of this research mainly comes from applications to uncertainty quantification for ordinary or partial differential equations with random coefficients. The problem we are interested in is to estimate an expectation (integral)

I(f) = ∫_{D^s} f(x) ∏_{j=1}^{s} φ(x_j) dx

for large s, with φ being the univariate probability density function defined over D ⊆ ℝ. In some applications, the integrand is of the form

f(x) = g(xA)

for a matrix A ∈ ℝ^{s×τ} and a function g: ℝ^τ → ℝ, see Dick et al. (2015). Here we note that x is defined as a row vector. Typically, φ is given by the uniform distribution on the unit interval D = [0,1], or by the standard normal distribution on the real line D = ℝ.

The standard Monte Carlo method approximates I(f) as follows: we first generate a sequence of i.i.d. samples of the random vector x = (x_1, ..., x_s):

x_1, x_2, ..., x_N ∈ D^s,

and then approximate I(f) by

I_N^{MC}(f) = (1/N) ∑_{n=1}^{N} f(x_n).   (1)

It is well known that

E[I_N^{MC}(f)] = I(f)

and

V[I_N^{MC}(f)] = σ²(f)/N,   where σ²(f) = ∫_{D^s} (f(x) − I(f))² ∏_{j=1}^{s} φ(x_j) dx,

which ensures the canonical "one over square root of N" convergence.

Now let us consider a situation where computing x_nA for x_n ∈ D^s takes a significant amount of time in the computation of f(x_n) = g(x_nA). In general, if the matrix A does not have any special structure such as circulant, Hankel, Toeplitz, or Vandermonde, then a fast matrix-vector multiplication is not available and the computation of x_nA requires O(sτ) arithmetic operations. Some examples where a fast matrix-vector multiplication has been established are the following: in Feischl et al. (2018) the authors use H-matrices to obtain an approximation of a covariance matrix which also permits a fast matrix-vector multiplication; in Giles et al. (2008) the authors show how a (partially) fast matrix-vector product can be implemented for multi-asset pricing in finance; Brownian bridge and principal component analysis factorizations of the covariance matrix in finance also permit a fast matrix-vector multiplication (Giles et al. 2008). Here we consider the case where either a fast matrix-vector product is not available, or one wants to avoid H-matrices and particular covariance factorizations, since we do not impose any restrictions on A.

In order to reduce this computational cost, we propose an alternative, novel Monte Carlo estimator in this paper. Instead of generating a sequence of i.i.d. samples of the vector x, we generate a sequence of i.i.d. samples of a single random variable, denoted by x_1, x_2, ..., and then approximate I(f) by

I_N^{TMC}(f) = (1/N) ∑_{n=1}^{N} f(y_n)

with

y_n = (x_{n+s−1}, x_{n+s−2}, ..., x_n).

The computation of I_N^{TMC}(f) can be done as follows:

Algorithm 1

For n = 1, ..., N+s−1, let x_n be i.i.d. samples of a random variable following φ.

  1. Define X ∈ ℝ^{N×s} by

     X = ( x_s        x_{s−1}     ⋯   x_1
           x_{s+1}    x_s         ⋯   x_2
           ⋮          ⋮                ⋮
           x_{N+s−1}  x_{N+s−2}   ⋯   x_N ).

     Note that X is a Toeplitz matrix, i.e., constant along each diagonal.

  2. Compute Y = XA.

  3. Then I_N^{TMC}(f) is given by

     I_N^{TMC}(f) = (1/N) ∑_{n=1}^{N} g(y_n),

     where y_n denotes the n-th row of Y.
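As an illustration, here is a minimal Python sketch of Algorithm 1 (our own, not code from the paper), assuming φ is the standard normal density. The helper name tmc_estimate is hypothetical, and scipy.linalg.matmul_toeplitz is used to multiply the Toeplitz matrix X by A via the FFT:

```python
import numpy as np
from scipy.linalg import matmul_toeplitz

def tmc_estimate(g, A, N, seed=None):
    """Sketch of Algorithm 1 for f(x) = g(xA), assuming phi = N(0,1).

    g is assumed to act row-wise on an (N, tau) array; A has shape (s, tau).
    """
    rng = np.random.default_rng(seed)
    s, _tau = A.shape
    x = rng.standard_normal(N + s - 1)  # i.i.d. scalar samples x_1, ..., x_{N+s-1}
    # X has rows y_n = (x_{n+s-1}, ..., x_n); a Toeplitz matrix is fully
    # determined by its first column and first row (0-based indexing below).
    first_col = x[s - 1:]     # x_{s-1}, x_s, ..., x_{N+s-2}
    first_row = x[s - 1::-1]  # x_{s-1}, x_{s-2}, ..., x_0
    Y = matmul_toeplitz((first_col, first_row), A)  # computes X @ A via FFT
    return float(np.mean(g(Y)))
```

A non-vectorized g can instead be applied in a loop over the rows of Y.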

The idea behind introducing this algorithm comes from a recent paper by Dick et al. (2015), who consider replacing the point set used in the standard Monte Carlo estimator (1) with a special class of quasi-Monte Carlo point sets which permit a fast matrix-vector multiplication for computing the products x_nA. This paper considers a sampling scheme different from Dick et al. (2015) while still allowing for a fast matrix-vector multiplication.

When s is quite large, say in the thousands or millions, the number of samples N one can afford has to be set significantly smaller than would be desirable, since each evaluation of f costs O(sτ) operations. Throughout this paper we consider the case where N = νs for some ν ∈ ℕ. Since the matrix-vector multiplication between the Toeplitz matrix X and each column vector of A can be done with

O((N+s) log(N+s)) = O(N log N)

arithmetic operations by using the fast Fourier transform

(Frigo and Johnson 2005), the matrix-matrix multiplication appearing in the second item of Algorithm 1 can be done with O(τN log N) arithmetic operations. This way the necessary computational cost can be reduced from O(τNs) to O(τN log N), which is the major advantage of using I_N^{TMC}(f).
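The stated operation count comes from the standard circulant-embedding trick: an N×s Toeplitz matrix embeds into an (N+s−1)-point circulant matrix, which the discrete Fourier transform diagonalizes. A self-contained sketch of this trick (ours, not the paper's code) reads:

```python
import numpy as np

def toeplitz_matvec(first_col, first_row, v):
    """Multiply an N-by-s Toeplitz matrix by a length-s vector v via the FFT.

    first_col and first_row define the matrix (first_col[0] == first_row[0]);
    the product costs O((N+s) log(N+s)) instead of O(N*s).
    """
    N, s = len(first_col), len(first_row)
    # First column of the circulant embedding: diagonals t_0, ..., t_{N-1}
    # followed by t_{-(s-1)}, ..., t_{-1}.
    col = np.concatenate([first_col, first_row[:0:-1]])
    v_pad = np.concatenate([v, np.zeros(N - 1)])  # pad v to length N + s - 1
    out = np.fft.ifft(np.fft.fft(col) * np.fft.fft(v_pad))
    return out[:N].real  # first N entries of the circulant product
```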

In this paper we call I_N^{TMC}(f) a Toeplitz Monte Carlo (TMC) estimator of I(f), as we rely on the Toeplitz structure of X to achieve a faster computation. (If the sample nodes are given by y_n = (x_n, x_{n+1}, ..., x_{n+s−1}) instead of (x_{n+s−1}, x_{n+s−2}, ..., x_n), the matrix X becomes a Hankel matrix, which also allows for a fast matrix-vector multiplication. We could therefore call our proposal a Hankel Monte Carlo (HMC) estimator instead; however, in the context of Monte Carlo methods, HMC often refers to the Hamiltonian Monte Carlo algorithm, and we would like to avoid duplicating abbreviations by coining the name Toeplitz Monte Carlo.) In the remainder of this paper, we study the variance of the TMC estimator and its dependence on the dimension s, and also examine the practical efficiency of the TMC estimator by carrying out numerical experiments for applications from ordinary/partial differential equations with random coefficients.

2 Theoretical results

2.1 Variance analysis

In order to study the variance of I_N^{TMC}(f), we introduce the concept of the analysis-of-variance (ANOVA) decomposition of multivariate functions (Hoeffding 1948, Kuo et al. 2010, Sobol' 1993). In what follows, for simplicity of notation, we write {1:s} = {1, 2, ..., s}. For a subset u ⊆ {1:s}, we write x_u = (x_j)_{j∈u} and denote the cardinality of u by |u|. Let f be a square-integrable function, i.e., ∫_{D^s} f²(x) ∏_{j=1}^{s} φ(x_j) dx < ∞. Then f can be decomposed into

f(x) = ∑_{u⊆{1:s}} f_u(x_u),

where we write x = x_{{1:s}}, and each summand is defined recursively by f_∅ = I(f) and

f_u(x_u) = ∫_{D^{s−|u|}} f(x) ∏_{j∈{1:s}∖u} φ(x_j) dx_{{1:s}∖u} − ∑_{v⊊u} f_v(x_v)

for ∅ ≠ u ⊆ {1:s}. Regarding this decomposition of multivariate functions, the following properties hold. We refer to Lemmas A.1 & A.3 of Owen (2019) for the proof of the case where φ is the uniform distribution over the unit interval [0,1].

Lemma 1

With the notation above, we have:

  1. For any non-empty u ⊆ {1:s} and any j ∈ u,

     ∫_D f_u(x_u) φ(x_j) dx_j = 0.

  2. For any u, v ⊆ {1:s} with u ≠ v,

     ∫_{D^s} f_u(x_u) f_v(x_v) ∏_{j=1}^{s} φ(x_j) dx = 0.

It follows from the second assertion of Lemma 1 that

∫_{D^s} f²(x) ∏_{j=1}^{s} φ(x_j) dx = ∑_{u⊆{1:s}} ∫_{D^{|u|}} f_u²(x_u) ∏_{j∈u} φ(x_j) dx_u.

This equality means that the variance of f can be expressed as a sum of the variances of the lower-dimensional functions f_u:

σ²(f) = ∑_{∅≠u⊆{1:s}} σ²(f_u),   with   σ²(f_u) = ∫_{D^{|u|}} f_u²(x_u) ∏_{j∈u} φ(x_j) dx_u.   (2)
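These orthogonality properties are easy to check numerically on a toy function whose ANOVA terms are known in closed form; the two-dimensional function below is our own illustration, not an example from the paper:

```python
import numpy as np

# For f(x_1, x_2) = x_1 + x_1*x_2 with i.i.d. N(0,1) inputs, the ANOVA terms
# are f_{} = 0, f_{1} = x_1, f_{2} = 0 and f_{1,2} = x_1*x_2.
rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal((2, 10**6))
f1, f12 = x1, x1 * x2

print(np.mean(f1 * f12))                 # ~ 0: distinct ANOVA terms are orthogonal
print(np.var(f1 + f12))                  # ~ 2: total variance of f
print(np.mean(f1**2) + np.mean(f12**2))  # ~ 2: sum of the term variances, as in (2)
```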

Using these facts, the variance of the TMC estimator can be analyzed as follows:

Theorem 2.1

We have

E[I_N^{TMC}(f)] = I(f)

and

V[I_N^{TMC}(f)] = σ²(f)/N + (2/N) ∑_{i=1}^{min(s,N)−1} (1 − i/N) ∑_{∅≠u⊆{1:s−i}} ∫_{D^{|u|}} f_u(x_u) f_{i+u}(x_u) ∏_{j∈u} φ(x_j) dx_u,

where we write i + u := {i + j : j ∈ u}.

Note that, in the theorem, we write

f_{i+u}(x_u) = f_{i+u}(x_{j_1}, ..., x_{j_{|u|}})   for u = {j_1, ..., j_{|u|}} with j_1 < ... < j_{|u|}.

The readers should not confuse this with

f_{i+u}(x_{i+u}) = f_{i+u}(x_{i+j_1}, ..., x_{i+j_{|u|}}).

Proof

The first assertion follows immediately from the linearity of expectation and the trivial equality E[f(y_n)] = I(f). For the second assertion, by using the ANOVA decomposition of f, we have

V[I_N^{TMC}(f)] = (1/N²) ∑_{n=1}^{N} ∑_{m=1}^{N} Cov[f(y_n), f(y_m)] = σ²(f)/N + (2/N²) ∑_{n=1}^{N−1} ∑_{m=n+1}^{N} ∑_{∅≠u,v⊆{1:s}} E[f_u((y_n)_u) f_v((y_m)_v)].

It follows from the first assertion of Lemma 1 that the second term on the right-most side above becomes

(2/N²) ∑_{i=1}^{N−1} (N − i) ∑_{∅≠u,v⊆{1:s}} E[f_u((y_n)_u) f_v((y_{n+i})_v)],

where we reordered the sum over n and m with respect to the difference i = m − n in the last equality.

If i ≥ s, there is no overlapping of the components between y_n and y_{n+i}. Because of the independence of the samples, it follows from the first assertion of Lemma 1 that the inner sum over u and v above is given by

∑_{∅≠u,v⊆{1:s}} E[f_u((y_n)_u)] E[f_v((y_{n+i})_v)] = 0.

If i < s, on the other hand, we have (y_{n+i})_{j+i} = (y_n)_j for any j ∈ {1:s−i}. With this equality and the first assertion of Lemma 1, the inner sum over u and v becomes

∑_{∅≠u⊆{1:s−i}} E[f_u((y_n)_u) f_{i+u}((y_{n+i})_{i+u})] = ∑_{∅≠u⊆{1:s−i}} ∫_{D^{|u|}} f_u(x_u) f_{i+u}(x_u) ∏_{j∈u} φ(x_j) dx_u.

Altogether we obtain

V[I_N^{TMC}(f)] = σ²(f)/N + (2/N) ∑_{i=1}^{min(s,N)−1} (1 − i/N) ∑_{∅≠u⊆{1:s−i}} ∫_{D^{|u|}} f_u(x_u) f_{i+u}(x_u) ∏_{j∈u} φ(x_j) dx_u.

Thus we are done.∎

As is clear from Theorem 2.1, the TMC estimator is unbiased and maintains the canonical "one over square root of N" convergence. Moreover, the TMC estimator can be regarded as a variance reduction technique, since the second term in the variance can be negative, depending on the function.

Example 1

To illustrate the last comment, let us consider a simple test function given by

f(x_1, x_2, x_3) = x_1 + x_2 − x_3,

and let x_1, x_2, x_3 be normally distributed independent random variables with mean 0 and variance 1. It is easy to see that the only non-vanishing ANOVA terms are

f_{{1}}(x_1) = x_1,   f_{{2}}(x_2) = x_2,   f_{{3}}(x_3) = −x_3,

each of which has variance 1, so that σ²(f) = 3. Then it follows that

V[I_N^{MC}(f)] = 3/N,

whereas, for N ≥ 3, we have

V[I_N^{TMC}(f)] = (1/N) [3 + 2(1 − 1/N)(1 − 1) + 2(1 − 2/N)(−1)] = (1/N)(1 + 4/N).

Therefore the variance of the TMC estimator is almost one-third of the variance of the standard Monte Carlo estimator.
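Both variances are easy to verify empirically by replicating the two estimators; the script below is a small verification sketch with arbitrarily chosen N and replication count:

```python
import numpy as np

def f(x):  # the test function of Example 1, acting on the last axis
    return x[..., 0] + x[..., 1] - x[..., 2]

s, N, R = 3, 64, 20_000  # dimension, samples per estimate, replications
rng = np.random.default_rng(2)
idx = np.arange(N)[:, None] + (s - 1 - np.arange(s))[None, :]  # row n: n+s-1-j

mc = np.empty(R)
tmc = np.empty(R)
for r in range(R):
    mc[r] = f(rng.standard_normal((N, s))).mean()  # standard MC estimate
    x = rng.standard_normal(N + s - 1)             # TMC: one scalar i.i.d. stream
    tmc[r] = f(x[idx]).mean()                      # rows y_n = (x_{n+s-1}, ..., x_n)

print(mc.var(), 3 / N)             # both ~ 0.047
print(tmc.var(), (1 + 4 / N) / N)  # both ~ 0.017, about one-third of the above
```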

It is also possible that the variance of the TMC estimator increases compared to that of the standard Monte Carlo estimator; however, we show below that this increase is bounded.

2.2 Weighted space and tractability

Here we study the dependence of the variance on the dimension s. For this purpose, we first give a bound on the cross terms appearing in the second assertion of Theorem 2.1. For i ∈ {1, ..., s−1}, let

B_i := ∑_{∅≠u⊆{1:s−i}} ∫_{D^{|u|}} f_u(x_u) f_{i+u}(x_u) ∏_{j∈u} φ(x_j) dx_u.

Then it follows from the decomposition (2) that

∑_{∅≠u⊆{1:s−i}} σ²(f_u) ≤ σ²(f)   and   ∑_{∅≠u⊆{1:s−i}} σ²(f_{i+u}) ≤ σ²(f),

resulting in an equality

V[I_N^{TMC}(f)] = σ²(f)/N + (2/N) ∑_{i=1}^{min(s,N)−1} (1 − i/N) B_i

in which each B_i can be controlled by σ²(f). Using Theorem 2.1, the variance is bounded above as follows.

Corollary 1

We have

V[I_N^{TMC}(f)] ≤ (σ²(f)/N) ( 1 + 2 ∑_{i=1}^{min(s,N)−1} (1 − i/N) ).

Proof

For any i ∈ {1, ..., s−1}, the Cauchy-Schwarz inequality leads to

|B_i| ≤ ( ∑_{∅≠u⊆{1:s−i}} σ²(f_u) )^{1/2} ( ∑_{∅≠u⊆{1:s−i}} σ²(f_{i+u}) )^{1/2} ≤ σ²(f).

Applying this bound to the second assertion of Theorem 2.1, we obtain

V[I_N^{TMC}(f)] ≤ σ²(f)/N + (2/N) ∑_{i=1}^{min(s,N)−1} (1 − i/N) σ²(f). ∎

Using this result, we have

V[I_N^{TMC}(f)] ≤ (σ²(f)/N) ( 1 + 2 ∑_{i=1}^{min(s,N)−1} (1 − i/N) ) ≤ (2 min(s,N) − 1) σ²(f)/N,

wherein, for the second inequality, the equality is attained if and only if min(s,N) = 1. Therefore, when we fix the number of samples, the variance of the TMC estimator can at most be 2 min(s,N) − 1 ≤ 2s − 1 times larger than the variance of the standard Monte Carlo estimator.

Now let us consider the case N = νs with ν ∈ ℕ and assume, as discussed in the first section, that the computational time for the standard Monte Carlo estimator is proportional to τNs, whereas the computational time for the TMC estimator is proportional to τN log N (assuming that the main cost in evaluating f lies in the computation of xA). When we fix the cost instead of the number of samples, we have

N^{MC} τ s ≍ N^{TMC} τ log N^{TMC},

where ≍ indicates that the terms should be of the same order, and so

N^{TMC} ≍ N^{MC} s / log N^{TMC}.

Thus, the variance of the TMC estimator for a given cost is at most log N^{TMC} times as large as the variance of the standard Monte Carlo estimator (up to some constant factor). On the other hand, if there is some decay of the importance of the ANOVA terms as the index of the variable increases, for instance, if the first few terms in (2) dominate the sum, then the ratio V[I_N^{TMC}(f)]/V[I_N^{MC}(f)] can be bounded independently of s, leading to a gain in the efficiency of the TMC estimator. We observe such behaviour in our numerical experiments below.

Following the idea from Sloan and Woźniakowski (1998), we now introduce the notion of a weighted space. Let γ = (γ_j)_{j≥1} be a sequence of non-negative real numbers called weights, and for u ⊆ {1:s} write γ_u = ∏_{j∈u} γ_j with γ_∅ = 1. Then the weighted space F_{s,γ} is defined by

F_{s,γ} = { f : D^s → ℝ : ‖f‖_{s,γ} < ∞ },

where

‖f‖_{s,γ} = ( ∑_{∅≠u⊆{1:s}} σ²(f_u)/γ_u )^{1/2}.

For any subset u with γ_u = 0, we assume that the corresponding ANOVA term f_u is 0 and we formally set σ²(f_u)/γ_u = 0.

For a randomized algorithm A_{N,s} using N function evaluations of f to estimate I(f), let us consider the minimal cost to estimate I(f) with mean square error at most ε² for any f in the unit ball of F_{s,γ}:

N^{min}(ε, s) := min{ N ∈ ℕ : sup_{‖f‖_{s,γ}≤1} E[(A_{N,s}(f) − I(f))²] ≤ ε² },

where the expectation is taken with respect to the randomness of the algorithm.

We say that the algorithm A_{N,s} is

  • a weakly tractable algorithm if

    lim_{s+ε^{−1}→∞} ln N^{min}(ε, s) / (s + ε^{−1}) = 0,

  • a polynomially tractable algorithm if there exist non-negative constants C, p, q such that

    N^{min}(ε, s) ≤ C ε^{−p} s^{q}

    holds for all s ∈ ℕ and ε ∈ (0,1), where p and q are called the ε-exponent and the s-exponent, respectively,

  • a strongly polynomially tractable algorithm if A_{N,s} is a polynomially tractable algorithm with the s-exponent 0.

We refer to Novak and Woźniakowski (2010) for more information on the notion of tractability.

For instance, the standard Monte Carlo estimator is a strongly polynomially tractable algorithm with the ε-exponent 2 if

∑_{j=1}^{∞} γ_j < ∞   (3)

holds. This claim can be proven as follows:

V[I_N^{MC}(f)] = σ²(f)/N = (1/N) ∑_{∅≠u⊆{1:s}} γ_u (σ²(f_u)/γ_u) ≤ (‖f‖²_{s,γ}/N) ∑_{∅≠u⊆{1:s}} γ_u = (‖f‖²_{s,γ}/N) ( ∏_{j=1}^{s} (1 + γ_j) − 1 ).

It follows that, in order to have

V[I_N^{MC}(f)] ≤ ε²

for any f with ‖f‖_{s,γ} ≤ 1, we need N ≥ ε^{−2} (∏_{j=1}^{s}(1 + γ_j) − 1). Thus the minimal cost is bounded above by

N^{min}(ε, s) ≤ ⌈ ε^{−2} ( ∏_{j=1}^{s} (1 + γ_j) − 1 ) ⌉.

Given the condition (3), we see that ∏_{j=1}^{s}(1 + γ_j) ≤ exp(∑_{j=1}^{∞} γ_j) is bounded independently of the dimension s, and the algorithm is strongly polynomially tractable with the ε-exponent 2.
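As an illustration of condition (3), take the product weights γ_j = j^{−2} (an assumed decay, chosen only for this example). The cost bound above then stabilizes as s grows:

```python
import numpy as np

eps = 0.01
for s in (10, 100, 1000, 10**4):
    gamma = np.arange(1.0, s + 1) ** (-2)  # assumed weights gamma_j = j^(-2)
    n_min_bound = np.ceil((np.prod(1 + gamma) - 1) / eps**2)
    print(s, int(n_min_bound))  # approaches a constant (~2.7e4): strong tractability
```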

The following theorem gives sufficient conditions on the weights for the TMC estimator to be a weakly tractable algorithm, a polynomially tractable algorithm, or a strongly polynomially tractable algorithm.

Theorem 2.2

The TMC estimator is

  • a weakly tractable algorithm if

    lim_{s→∞} (1/s) ∑_{j=1}^{s} γ_j = 0,

  • a polynomially tractable algorithm with the ε-exponent 2 if there exists a constant q ≥ 0 such that

    limsup_{s→∞} s^{−q} ∏_{j=1}^{s} (1 + γ_j) < ∞,

  • a strongly polynomially tractable algorithm with the ε-exponent 2 if

    limsup_{s→∞} s ( ∏_{j=1}^{s} (1 + γ_j) − 1 ) < ∞.

Proof

It follows from Corollary 1 and Hölder's inequality for sums that

V[I_N^{TMC}(f)] ≤ ((2s − 1)/N) σ²(f) = ((2s − 1)/N) ∑_{∅≠u⊆{1:s}} γ_u (σ²(f_u)/γ_u) ≤ ((2s − 1)/N) ‖f‖²_{s,γ} ( ∏_{j=1}^{s} (1 + γ_j) − 1 ).