On Some Resampling Procedures with the Empirical Beta Copula

The empirical beta copula is a simple but effective smoother of the empirical copula. Because it is a genuine copula, from which, moreover, it is particularly easy to sample, it is reasonable to expect that resampling procedures based on the empirical beta copula are expedient and accurate. In this paper, after reviewing the literature on some bootstrap approximations for the empirical copula process, we first show the asymptotic equivalence of several bootstrapped processes related to the empirical copula and empirical beta copula. Then we investigate the finite-sample properties of resampling schemes based on the empirical (beta) copula by Monte Carlo simulation. More specifically, we consider interval estimation for some functionals such as rank correlation coefficients and dependence parameters of several well-known families of copulas, constructing confidence intervals by several methods and comparing their accuracy and efficiency. We also compute the actual size and power of symmetry tests based on several resampling schemes for the empirical copula and empirical beta copula.

1 Introduction

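Let $\mathbf{X}_i = (X_{i,1}, \ldots, X_{i,d})$, $i \in \{1, \ldots, n\}$, be independent and identically distributed random vectors, and assume that the cumulative distribution function, $F$, of $\mathbf{X}_i$ is continuous. By Sklar’s theorem Sklar59 , there exists a unique copula, $C$, such that

$$F(x_1, \ldots, x_d) = C\{F_1(x_1), \ldots, F_d(x_d)\}, \qquad (x_1, \ldots, x_d) \in \mathbb{R}^d,$$

where $F_j$ is the $j$th marginal distribution function of $F$. In fact, in the continuous case, we have $C(\mathbf{u}) = F\{F_1^-(u_1), \ldots, F_d^-(u_d)\}$ for $\mathbf{u} \in [0,1]^d$, where $H^-(u) = \inf\{t \in \mathbb{R} : H(t) \ge u\}$ is the generalized inverse of a distribution function $H$. The empirical copula Deheu79 is defined by

$$\mathbb{C}_n(\mathbf{u}) = F_n\{F_{n,1}^-(u_1), \ldots, F_{n,d}^-(u_d)\},$$

where, for $j \in \{1, \ldots, d\}$,

$$F_n(\mathbf{x}) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d 1\{X_{i,j} \le x_j\}, \qquad F_{n,j}(x_j) = \frac{1}{n} \sum_{i=1}^n 1\{X_{i,j} \le x_j\}.$$

For $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, d\}$, let $R_{i,j}^{(n)}$ be the rank of $X_{i,j}$ among $X_{1,j}, \ldots, X_{n,j}$; namely,

$$R_{i,j}^{(n)} = \sum_{k=1}^n 1\{X_{k,j} \le X_{i,j}\}. \tag{1}$$

Frequently used is a rank-based version of the empirical copula given by

$$\tilde{\mathbb{C}}_n(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d 1\{R_{i,j}^{(n)}/n \le u_j\}. \tag{2}$$

In the absence of ties, we have

$$\sup_{\mathbf{u} \in [0,1]^d} \bigl| \mathbb{C}_n(\mathbf{u}) - \tilde{\mathbb{C}}_n(\mathbf{u}) \bigr| \le \frac{d}{n}. \tag{3}$$

Both functions $\mathbb{C}_n$ and $\tilde{\mathbb{C}}_n$ are piecewise constant and cannot be genuine copulas. When the sample size is small, they suffer from the presence of ties when used in resampling.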
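As a concrete illustration of the ranks in (1) and the rank-based empirical copula in (2), here is a minimal Python sketch. The function names and the toy data are hypothetical and chosen only for illustration; the code assumes a sample without ties and makes no attempt at efficiency.

```python
def componentwise_ranks(sample):
    """Return R[i][j] = #{k : X[k][j] <= X[i][j]}, i.e. the rank in (1)."""
    n = len(sample)
    d = len(sample[0])
    return [
        [sum(1 for k in range(n) if sample[k][j] <= sample[i][j]) for j in range(d)]
        for i in range(n)
    ]


def rank_based_empirical_copula(ranks, u):
    """Fraction of observations with R[i][j] / n <= u[j] in every coordinate j."""
    n = len(ranks)
    count = sum(1 for row in ranks if all(r / n <= uj for r, uj in zip(row, u)))
    return count / n


# Hypothetical bivariate toy sample without ties.
sample = [(0.3, 1.2), (1.5, 0.4), (0.9, 2.2), (2.1, 1.9)]
ranks = componentwise_ranks(sample)
value = rank_based_empirical_copula(ranks, (0.5, 0.75))
```

The result is a step function of `u` that jumps only at multiples of `1/n`, which is why it cannot be a genuine copula.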

The empirical beta copula SST2017 is a simple but effective way of correcting and smoothing the empirical copula. Its definition will be given in Section 3. Even though its asymptotic distribution is the same as that of the usual empirical copula, its accuracy in small samples is usually better, partly because it is itself always a genuine copula. Moreover, drawing random samples from the empirical beta copula is quite straightforward.

Because of these properties, it is reasonable to expect that simple and accurate resampling schemes for the empirical copula process can be constructed based on the empirical beta copula. For tail copulas, a simulation study in Kiriliouk-Segers-Tafakori2018 showed that the bootstrap based on the empirical beta copula worked significantly better than the direct multiplier bootstrap from Buech-Dette2010 . The purpose of this paper is to investigate further both the finite-sample and asymptotic behavior of this resampling method, now for general copulas rather than tail copulas.

The paper is structured as follows. In Section 2, we review and discuss the literature on resampling methods for the empirical copula process. The asymptotic properties of two resampling procedures based on the empirical beta copula are investigated in Section 3. In Section 4, extensive simulation studies are conducted to demonstrate the effectiveness of resampling procedures based on the empirical beta copula for constructing confidence intervals for several copula functionals and for testing shape constraints on the copula. We conclude the paper with some discussion and open questions in Section 5. All proofs are grouped together in the Appendix.

2 Review on bootstrapping empirical copula processes

In this section, we give a short review on bootstrapping empirical copula processes, incorporating some newer improvements. We limit ourselves to i.i.d. sequences and note that extensions to stationary time series have been considered in Buech-Volg2013 , among others.

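First we recall a basic result on the weak convergence of the empirical copula process. Let $\ell^\infty([0,1]^d)$ be the Banach space of real-valued, bounded functions on $[0,1]^d$, equipped with the supremum norm $\|\cdot\|_\infty$. The arrow $\rightsquigarrow$ denotes weak convergence in the sense used in Vaart-Wellner . The following condition is the only one needed for our convergence results.

Condition 2.1

For each $j \in \{1, \ldots, d\}$, the copula $C$ has a continuous first-order partial derivative $\dot{C}_j(\mathbf{u}) = \partial C(\mathbf{u}) / \partial u_j$ on the set $\{\mathbf{u} \in [0,1]^d : 0 < u_j < 1\}$.

The following theorem is proved in Segers2012 . Let $\mathbb{U}^C$ denote a $C$-pinned Brownian sheet, i.e., a centered Gaussian process on $[0,1]^d$ with continuous trajectories and covariance function

$$\operatorname{Cov}\{\mathbb{U}^C(\mathbf{u}), \mathbb{U}^C(\mathbf{v})\} = C(\mathbf{u} \wedge \mathbf{v}) - C(\mathbf{u})\, C(\mathbf{v}), \qquad \mathbf{u}, \mathbf{v} \in [0,1]^d, \tag{4}$$

where $\mathbf{u} \wedge \mathbf{v}$ denotes the componentwise minimum.

Theorem 2.2

Suppose Condition 2.1 holds. Then, for the empirical copula $\mathbb{C}_n$, we have

$$\mathbb{G}_n := \sqrt{n}\,(\mathbb{C}_n - C) \rightsquigarrow \mathbb{G}_C, \qquad n \to \infty,$$

in $\ell^\infty([0,1]^d)$, where

$$\mathbb{G}_C(\mathbf{u}) = \mathbb{U}^C(\mathbf{u}) - \sum_{j=1}^d \dot{C}_j(\mathbf{u})\, \mathbb{U}^C(1, \ldots, 1, u_j, 1, \ldots, 1),$$

with $u_j$ appearing at the $j$-th coordinate.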

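Next we introduce notation for the convergence of conditional laws in probability given the data as defined in Kosorok2008 ; see also Vaart-Wellner , Section 2.9. Let

$$BL_1 := \bigl\{ h : \ell^\infty([0,1]^d) \to \mathbb{R} \;\big|\; \|h\|_\infty \le 1,\ |h(x) - h(y)| \le \|x - y\|_\infty \text{ for all } x, y \in \ell^\infty([0,1]^d) \bigr\}. \tag{5}$$

If $\hat{\mathbb{X}}_n = \hat{\mathbb{X}}_n(\mathbf{X}_1, \ldots, \mathbf{X}_n, \mathbf{W}_n)$ is a sequence of bootstrapped processes in $\ell^\infty([0,1]^d)$ with random weights $\mathbf{W}_n$, then the notation

$$\hat{\mathbb{X}}_n \overset{\mathrm{P}}{\underset{W}{\rightsquigarrow}} \mathbb{X}, \qquad n \to \infty, \tag{6}$$

means that

$$\sup_{h \in BL_1} \bigl| E_W h(\hat{\mathbb{X}}_n) - E h(\mathbb{X}) \bigr| \to 0 \text{ in outer probability}, \qquad E_W h(\hat{\mathbb{X}}_n)^* - E_W h(\hat{\mathbb{X}}_n)_* \overset{\mathrm{P}}{\to} 0 \text{ for all } h \in BL_1. \tag{7}$$

Here the notation $E_W$ indicates conditional expectation over the weights $\mathbf{W}_n$ given the data $\mathbf{X}_1, \ldots, \mathbf{X}_n$, and $h(\hat{\mathbb{X}}_n)^*$ and $h(\hat{\mathbb{X}}_n)_*$ denote the minimal measurable majorant and maximal measurable minorant, respectively, with respect to the joint data $(\mathbf{X}_1, \ldots, \mathbf{X}_n, \mathbf{W}_n)$.

In the sequel, the random weights $\mathbf{W}_n$ can signify different things: a multinomial random vector when drawing from the data with replacement, independent and identically distributed multipliers in the multiplier bootstrap, or vectors of order statistics from the uniform distribution when resampling from the empirical beta copula. In (6), the symbol $W$ will then be changed accordingly.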

2.1 Straightforward bootstrap

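Let $\mathbf{W}_n = (W_{n1}, \ldots, W_{nn})$ be a multinomial random vector with probabilities $(1/n, \ldots, 1/n)$, independent of the sample $\mathbf{X}_1, \ldots, \mathbf{X}_n$. Set

$$\mathbb{C}_n^*(\mathbf{u}) = F_n^*\{F_{n,1}^{*-}(u_1), \ldots, F_{n,d}^{*-}(u_d)\},$$

where

$$F_n^*(\mathbf{x}) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d 1\{X_{i,j} \le x_j\}, \qquad F_{n,j}^*(x_j) = \frac{1}{n} \sum_{i=1}^n W_{ni}\, 1\{X_{i,j} \le x_j\}.$$

We can also define the bootstrapped version of the rank-based empirical copula

$$\tilde{\mathbb{C}}_n^*(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d 1\{R_{i,j}^*/n \le u_j\}, \tag{8}$$

where

$$R_{i,j}^* = \sum_{k=1}^n W_{nk}\, 1\{X_{k,j} \le X_{i,j}\}. \tag{9}$$

Since a bootstrap sample will have ties with a (large) positive probability, the bound (3) is no longer valid for $\mathbb{C}_n^*$ and $\tilde{\mathbb{C}}_n^*$. But we can prove the following.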

Proposition 2.3

We have

(10)

The proof of Proposition 2.3 is given in the Appendix. Convergence in probability of the conditional laws

in the space was shown in Fer-Rad-Weg04 under the condition that all partial derivatives exist and are continuous on and in Buech-Volg2013 under the weaker Condition 2.1. Because of (3) and Proposition 2.3, it also holds that

(11)

2.2 Multiplier bootstrap with estimated partial derivatives

The multiplier bootstrap for the empirical copula proposed by Remillard-Scaillet2009 has proved useful for many problems. In Buech-Dette2010 it was found to have a better finite-sample performance than other resampling methods for the empirical copula process. We present a modified version given by Buech-Dette2010 that we employ for the simulation studies in Section 4.

Let

be independent and identically distributed non-negative random variables, independent of the data, with

, and . Put , and set

Define and . Using Theorem 2.6 in Kosorok2008 and the a.s. convergence , where is the identity function on , we can show that

Hence if is the estimate for , applying finite differencing to the empirical copula at a spacing proportional to , then the processes

give conditional approximations of . Namely, we have

3 Resampling with the empirical beta copula

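The empirical beta copula SST2017 is defined as

$$\mathbb{C}_n^\beta(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n, R_{i,j}^{(n)}}(u_j), \qquad \mathbf{u} \in [0,1]^d,$$

where $R_{i,j}^{(n)}$ denote the ranks as in (1) and where, for $u \in [0,1]$ and $r \in \{1, \ldots, n\}$,

$$F_{n,r}(u) = \sum_{s=r}^n \binom{n}{s} u^s (1-u)^{n-s} \tag{12}$$

is the cumulative distribution function of the beta distribution $\mathcal{B}(r, n+1-r)$. In this section, we examine the asymptotic properties of two resampling procedures based on the empirical beta copula.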
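To make the definition concrete, the following Python sketch evaluates the empirical beta copula at a point, computing the beta distribution function in (12) as a binomial tail sum (valid because both beta parameters are positive integers). The function names and the hard-coded rank matrix are hypothetical; this is an illustration under those assumptions, not the implementation used for the simulations.

```python
from math import comb


def beta_cdf_integer(u, r, n):
    """CDF at u of the beta distribution with parameters r and n + 1 - r,
    written as the binomial tail sum over s = r, ..., n."""
    return sum(comb(n, s) * u**s * (1.0 - u) ** (n - s) for s in range(r, n + 1))


def empirical_beta_copula(ranks, u):
    """Average over observations of the product of beta CDFs indexed by the ranks."""
    n = len(ranks)
    total = 0.0
    for row in ranks:
        term = 1.0
        for r, uj in zip(row, u):
            term *= beta_cdf_integer(uj, r, n)
        total += term
    return total / n


# Hypothetical rank matrix of a bivariate sample of size 4 without ties.
ranks = [[1, 2], [3, 1], [2, 4], [4, 3]]

# A genuine copula has uniform margins: setting the other argument to 1
# should return the remaining argument.
margin = empirical_beta_copula(ranks, (0.3, 1.0))
```

The uniform-margin check works because, in the absence of ties, averaging the binomial tail sums over all ranks gives the mean of a binomial variable divided by `n`.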

3.1 Standard bootstrap for the empirical beta copula

Let be a multinomial random vector with success probabilities , independent of the original sample. Set

where are the bootstrapped ranks in (9). Let , for , be independent binomial random variables. Let denote expectation with respect to , conditionally on the sample and the multinomial random vector. It follows that

where is the bootstrapped rank-based empirical copula in (8). Similarly, the empirical beta copula is

where is the rank-based empirical copula in (2). Consider the bootstrapped processes defined in (11) and . We find

(13)

From the weak convergence of the bootstrapped process , we will prove the following proposition. As a consequence, consistency of the bootstrapped process of the (rank-based) empirical copula in (11) entails consistency of the one for the empirical beta copula.

Proposition 3.1

Under Condition 2.1, we have

(14)

and thus as .

3.2 Bootstrap by drawing samples from the empirical beta copula

The original motivation of SST2017 was resampling: uniform random variables generated independently and rearranged in the order specified by the componentwise ranks of the original sample might in some sense be considered a bootstrap sample. Although this idea turned out to be not entirely correct, it is how the empirical beta copula was originally discovered. In the same spirit, it is natural to study the bootstrap method based on drawing samples from the empirical beta copula.

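It is in fact very simple to generate a random variate $\mathbf{V}^\# = (V_1^\#, \ldots, V_d^\#)$ from $\mathbb{C}_n^\beta$.

Algorithm 3.2

Given the ranks $R_{i,j}^{(n)}$, $i \in \{1, \ldots, n\}$, $j \in \{1, \ldots, d\}$, of the original sample:

  1. Generate $I$ from the discrete uniform distribution on $\{1, \ldots, n\}$.

  2. Generate independently $V_j^\# \sim \mathcal{B}(R_{I,j}^{(n)}, n + 1 - R_{I,j}^{(n)})$, $j \in \{1, \ldots, d\}$.

  3. Set $\mathbf{V}^\# = (V_1^\#, \ldots, V_d^\#)$.

Repeating the above algorithm $n$ times independently, we get a sample of $n$ independent random vectors drawn from $\mathbb{C}_n^\beta$, conditionally on the data $\mathbf{X}_1, \ldots, \mathbf{X}_n$. Let this sample be denoted by $\mathbf{V}_i^\# = (V_{i,1}^\#, \ldots, V_{i,d}^\#)$, $i \in \{1, \ldots, n\}$. We can think of this procedure as a kind of smoothed bootstrap (see Efron1982siam and Shao-Tu95 , Section 3.5) because the empirical beta copula may be thought of as a smoothed version of the empirical copula.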
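The three steps of Algorithm 3.2 can be sketched in a few lines of Python using only the standard library. The function name, the seed and the toy rank matrix are hypothetical; the sketch assumes ranks taking values in 1, ..., n so that both beta parameters are positive.

```python
import random


def sample_empirical_beta_copula(ranks, size, rng):
    """Draw `size` independent vectors from the empirical beta copula
    determined by the rank matrix `ranks` (n rows, d columns)."""
    n = len(ranks)
    draws = []
    for _ in range(size):
        # Step 1: pick an observation index uniformly at random.
        row = ranks[rng.randrange(n)]
        # Steps 2 and 3: independent beta variates, one per coordinate,
        # with parameters (rank, n + 1 - rank), collected into one vector.
        draws.append(tuple(rng.betavariate(r, n + 1 - r) for r in row))
    return draws


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
ranks = [[1, 2], [3, 1], [2, 4], [4, 3]]  # hypothetical ranks, n = 4, d = 2
bootstrap_sample = sample_empirical_beta_copula(ranks, size=len(ranks), rng=rng)
```

Because the beta variates are continuous, the resulting bootstrap sample has no ties with probability one, in contrast to a sample drawn with replacement from the original data.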

The joint and marginal empirical distribution functions of the bootstrap sample are

The ranks of the bootstrap sample are given by

(15)

These yield bootstrapped versions of the Deheuvels empirical copula, the rank-based empirical copula and the empirical beta copula:

Proposition 3.3

Assume Condition 2.1. Then as , we have conditional weak convergence in probability as defined in (6) with respect to the random vectors of the bootstrapped empirical copula processes

to the limit process defined in Theorem 2.2.

3.3 Approximating sampling distributions of rank statistics by resampling from the empirical beta copula

Statistical inference for $C$ often involves rank statistics. One way to justify this is to appeal to the invariance of $C$ under coordinatewise continuous strictly increasing transformations. Let us hence consider a rank statistic $T_n = T(\mathbf{R}_n)$, where $\mathbf{R}_n$ is a vector consisting of the coordinatewise ranks of $\mathbf{X}_1, \ldots, \mathbf{X}_n$. Below we suggest a way of approximating its distribution by drawing a sample from $\mathbb{C}_n^\beta$ and computing “bootstrap replicates”. This also avoids problems with ties encountered when drawing with replacement from the original data. Specifically, our procedure goes as follows.

Algorithm 3.4 (Smoothed beta bootstrap)

Given $\mathbf{R}_n$:

  1. Apply Algorithm 3.2 $n$ times independently to obtain a bootstrap sample $\mathbf{V}_1^\#, \ldots, \mathbf{V}_n^\#$ drawn from $\mathbb{C}_n^\beta$, compute their ranks $\mathbf{R}_n^\#$ as in (15) and put $T_n^\# = T(\mathbf{R}_n^\#)$.

  2. Repeat Step 1 a moderate to large number, $B$, of times to get bootstrap replicates $T_{n,1}^\#, \ldots, T_{n,B}^\#$.

  3. Use $T_{n,1}^\#, \ldots, T_{n,B}^\#$ to approximate the sampling distribution of $T_n$.

The validity of this procedure follows from our claim in the preceding subsection. Because all the related empirical copula processes are asymptotically equivalent, asymptotic theory alone cannot discriminate between the methods, and we need to look into their small-sample performance. In Subsection 4.2, we construct confidence intervals for some copula functionals by popular rank statistics.
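The smoothed beta bootstrap can be sketched as follows for a bivariate rank statistic, here the sample version of Kendall's rank correlation. Everything named in the code is hypothetical, the toy data are simulated, and the percentile interval at the end is an assumption made for illustration: the text above does not prescribe how the replicates are turned into a confidence interval.

```python
import random


def ranks_of(sample):
    """Componentwise ranks: R[i][j] = #{k : sample[k][j] <= sample[i][j]}."""
    n, d = len(sample), len(sample[0])
    return [[sum(sample[k][j] <= sample[i][j] for k in range(n)) for j in range(d)]
            for i in range(n)]


def kendall_tau(ranks):
    """Sample Kendall's tau computed from bivariate ranks (no tie correction)."""
    n = len(ranks)
    s = 0
    for i in range(n):
        for k in range(i + 1, n):
            prod = (ranks[i][0] - ranks[k][0]) * (ranks[i][1] - ranks[k][1])
            s += (prod > 0) - (prod < 0)  # sign of the product
    return 2.0 * s / (n * (n - 1))


def smoothed_beta_bootstrap(ranks, statistic, n_boot, rng):
    """Draw n vectors from the empirical beta copula, rank them, evaluate the
    statistic, and repeat n_boot times."""
    n = len(ranks)
    replicates = []
    for _ in range(n_boot):
        draw = []
        for _ in range(n):
            row = ranks[rng.randrange(n)]
            draw.append([rng.betavariate(r, n + 1 - r) for r in row])
        replicates.append(statistic(ranks_of(draw)))
    return replicates


def percentile_interval(replicates, level=0.95):
    """Simple percentile interval (an assumed choice of interval type)."""
    ordered = sorted(replicates)
    b = len(ordered)
    lower = ordered[int((1 - level) / 2 * b)]
    upper = ordered[min(b - 1, int((1 + level) / 2 * b))]
    return lower, upper


rng = random.Random(7)  # arbitrary seed
data = []
for _ in range(30):  # hypothetical positively dependent toy sample
    z = rng.gauss(0.0, 1.0)
    data.append((z + rng.gauss(0.0, 1.0), z + rng.gauss(0.0, 1.0)))

ranks = ranks_of(data)
replicates = smoothed_beta_bootstrap(ranks, kendall_tau, n_boot=200, rng=rng)
lower, upper = percentile_interval(replicates)
```

Any other statistic that depends on the data only through the ranks can be passed in place of `kendall_tau`.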

4 Simulation Studies

We assess the performance of the bootstrap methods presented in Sections 2 and 3 in a wide range of applications. In all of the experiments below, the number of Monte Carlo runs and the number of bootstrap replications are both set to . The nominal confidence level is always 0.95 and we use Clayton, Gumbel-Hougaard, Frank and Gauss copula families, see e.g. Nelsen2006 . Most simulations are done in R with the package copula copulaR , except for Subsection 4.2, where MATLAB code was used.

4.1 Covariance of the limiting process

We compare the estimated covariances of the limiting process based on the standard and smoothed beta bootstrap methods with the partial derivatives multiplier method, which in Buech-Dette2010 is shown to perform better than the straightforward bootstrap or the direct multiplier method. We follow the set-up in Buech-Dette2010 , evaluating the covariance at four points for in the unit square. The variables are such that for . For the bivariate Clayton copula with parameter , Table 1 shows the mean squared error of the estimated covariance based on the partial derivative multiplier method , the standard beta bootstrap and the smoothed beta bootstrap for and . Results for have been copied from Tables 3 and 4 in Buech-Dette2010 . Both methods based on the empirical beta copula outperform the multiplier method in all points but and ).

0.8887 0.5210 0.5222 0.3716 0.4595 0.2673 0.2798 0.1961
1.0112 0.1799 0.2988 0.5211 0.1069 0.1577
0.9899 0.2818 0.5092 0.1681
0.6250 0.2992
0.9992 0.3402 0.3473 0.1956 0.6205 0.2427 0.2383 0.1547
0.7887 0.1294 0.1889 0.4933 0.0857 0.1366
0.7644 0.1821 0.4898 0.1376
0.7108 0.4183
1.2248 0.2929 0.2924 0.1456 0.6761 0.1874 0.1888 0.1128
0.8461 0.0992 0.1691 0.4814 0.0703 0.1071
0.8856 0.1682 0.4956 0.1149
1.1209 0.5913
Table 1: Mean squared error of the covariance estimates for the bivariate Clayton copula with .

4.2 Confidence intervals for rank correlation coefficients

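We assess the performance of the straightforward bootstrap and the smoothed beta bootstrap (Subsections 2.1 and 3.3) for constructing confidence intervals for two popular rank correlation coefficients for bivariate distributions, Kendall’s $\tau$ and Spearman’s $\rho$, which are known to depend only on the copula $C$ associated with $F$.

The population Kendall’s $\tau$ is defined by

$$\tau(C) = 4 \int_{[0,1]^2} C(u, v)\, dC(u, v) - 1.$$

In terms of

$$Q_{ik} = \operatorname{sign}\{(X_{i,1} - X_{k,1})(X_{i,2} - X_{k,2})\},$$

the sample Kendall’s $\tau$ is given by

$$\hat{\tau}_n = \frac{2}{n(n-1)} \sum_{1 \le i < k \le n} Q_{ik}.$$

Its asymptotic variance may be estimated by

$$\hat{\sigma}_n^2 = \frac{2}{n(n-1)} \left[ \frac{2(n-2)}{n(n-1)^2} \sum_{i=1}^n (C_i - \bar{C})^2 + 1 - \hat{\tau}_n^2 \right],$$

where $C_i = \sum_{k \ne i} Q_{ik}$, and $\bar{C} = n^{-1} \sum_{i=1}^n C_i$ (see Hol-Wol-Chi2014 ). An asymptotic confidence interval for $\tau$ is thus given by $\hat{\tau}_n \pm z_{\alpha/2}\, \hat{\sigma}_n$, with $z_{\alpha/2}$ the usual standard normal tail quantile.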

This interval can be compared to the confidence intervals obtained by our resampling methods. Table 2 shows the coverage probabilities and the average lengths of the estimated confidence intervals based on the asymptotic distribution, the straightforward bootstrap and the smoothed beta bootstrap for the independence copula () and the Clayton copula with () and (. The smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the three.

| | method | n = 40 | 60 | 80 | 100 | n = 40 | 60 | 80 | 100 | n = 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| coverage probability | asymp | 0.952 | 0.930 | 0.941 | 0.959 | 0.946 | 0.931 | 0.937 | 0.943 | 0.933 | 0.941 | 0.939 | 0.926 |
| | boot | 0.957 | 0.937 | 0.942 | 0.963 | 0.949 | 0.940 | 0.949 | 0.949 | 0.951 | 0.947 | 0.938 | 0.935 |
| | beta | 0.964 | 0.949 | 0.949 | 0.966 | 0.952 | 0.947 | 0.954 | 0.955 | 0.963 | 0.935 | 0.948 | 0.939 |
| average length | asymp | 0.449 | 0.355 | 0.304 | 0.271 | 0.364 | 0.287 | 0.245 | 0.217 | 0.378 | 0.302 | 0.257 | 0.227 |
| | boot | 0.450 | 0.357 | 0.306 | 0.272 | 0.366 | 0.288 | 0.246 | 0.218 | 0.380 | 0.304 | 0.258 | 0.228 |
| | beta | 0.433 | 0.347 | 0.299 | 0.268 | 0.350 | 0.279 | 0.240 | 0.213 | 0.365 | 0.294 | 0.253 | 0.224 |

Table 2: Coverage probabilities and average lengths of confidence intervals for Kendall’s $\tau$ for the Clayton copula family computed via the normal approximation (asymp), the straightforward bootstrap (boot), and the smoothed beta bootstrap (beta).

The population Spearman’s and the sample Spearman’s rho are given by

The limiting distribution of equals that of , so it is possible in principle to construct confidence intervals based on this asymptotics. However, unlike the case of , it is cumbersome and involves the partial derivatives of , which must be estimated, so we omit it from our study here. In Table 3, one can see that the coverage probabilities are more conservative for the smoothed beta bootstrap than for the straightforward bootstrap, but the average lengths of the estimated confidence intervals are very similar for both methods. This could be due to the fact that , as can be directly computed.

| | method | n = 40 | 60 | 80 | 100 | n = 40 | 60 | 80 | 100 | n = 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| coverage probability | boot | 0.956 | 0.943 | 0.953 | 0.951 | 0.959 | 0.953 | 0.949 | 0.952 | 0.952 | 0.954 | 0.960 | 0.956 |
| | beta | 0.965 | 0.946 | 0.957 | 0.956 | 0.961 | 0.958 | 0.960 | 0.952 | 0.969 | 0.957 | 0.964 | 0.958 |
| average length | boot | 0.634 | 0.514 | 0.444 | 0.397 | 0.524 | 0.424 | 0.367 | 0.326 | 0.519 | 0.418 | 0.366 | 0.324 |
| | beta | 0.625 | 0.510 | 0.442 | 0.395 | 0.522 | 0.424 | 0.368 | 0.325 | 0.519 | 0.418 | 0.367 | 0.324 |

Table 3: Coverage probabilities and average lengths of confidence intervals for Spearman’s $\rho$ for the Clayton copula family based on the straightforward bootstrap (boot) and the smoothed beta bootstrap (beta).

4.3 Confidence intervals for a copula parameter

Suppose that the copula of is parametrized by , so that . When the ’s are unknown, the resulting problem of estimating is semiparametric and is studied in Gen-Gho-Riv95 ; Tsuka05 . Assume that is absolutely continuous with density , which is differentiable with respect to . Replacing the unknown ’s in the score equation by their (rescaled) empirical counterparts, one gets the estimating equation

(16)

where . The solution to (16) is called the pseudo-likelihood estimator.

We compare confidence intervals for when estimated by the pseudo-likelihood estimator based on the asymptotic variance given in Gen-Gho-Riv95 , the straightforward bootstrap, the smoothed beta bootstrap and the classic parametric bootstrap. Tables 4 and 5 show the estimated coverage probabilities and average interval lengths of the confidence intervals for the Clayton, Gauss, Frank and Gumbel–Hougaard copula families. For the Clayton copula, the smoothed beta bootstrap gives the shortest intervals both for and , but only for the coverage probabilities are too liberal, which is somewhat puzzling. For the Frank and Gumbel–Hougaard copulas, the smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the four. For the Gauss copula, the asymptotic approximation gives significantly smaller coverage probabilities than the nominal value 0.95.

| | method | n = 40 | 60 | 80 | 100 | n = 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|---|---|---|---|
| coverage probability | asymp | 0.954 | 0.969 | 0.960 | 0.965 | 0.951 | 0.940 | 0.940 | 0.946 |
| | boot | 0.953 | 0.943 | 0.944 | 0.943 | 0.968 | 0.952 | 0.953 | 0.951 |
| | beta | 0.953 | 0.964 | 0.957 | 0.952 | 0.933 | 0.904 | 0.908 | 0.906 |
| | param | 0.924 | 0.923 | 0.933 | 0.948 | 0.957 | 0.951 | 0.955 | 0.953 |
| average length | asymp | 2.011 | 1.632 | 1.354 | 1.237 | 2.764 | 2.142 | 1.821 | 1.615 |
| | boot | 1.894 | 1.449 | 1.198 | 1.046 | 2.991 | 2.205 | 1.841 | 1.626 |
| | beta | 1.517 | 1.225 | 1.050 | 0.935 | 1.957 | 1.612 | 1.420 | 1.296 |
| | param | 1.914 | 1.448 | 1.222 | 1.070 | 2.821 | 2.150 | 1.829 | 1.617 |

Table 4: Coverage probabilities and average lengths of confidence intervals for the parameter of the Clayton copula with () and (). Intervals computed via the asymptotic normal approximation (asymp), the straightforward bootstrap (boot), the smoothed beta bootstrap (beta), and the parametric bootstrap (param).
| | method | Gauss: n = 40 | 60 | 80 | 100 | Frank: n = 40 | 60 | 80 | 100 | Gumbel–Hougaard: n = 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| coverage probability | asymp | 0.881 | 0.895 | 0.910 | 0.928 | 0.941 | 0.950 | 0.948 | 0.965 | 0.954 | 0.940 | 0.940 | 0.955 |
| | boot | 0.942 | 0.944 | 0.947 | 0.950 | 0.957 | 0.956 | 0.946 | 0.963 | 0.965 | 0.951 | 0.953 | 0.965 |
| | beta | 0.968 | 0.962 | 0.970 | 0.953 | 0.965 | 0.961 | 0.952 | 0.965 | 0.970 | 0.951 | 0.952 | 0.954 |
| | param | 0.903 | 0.921 | 0.923 | 0.930 | 0.938 | 0.956 | 0.941 | 0.962 | 0.924 | 0.926 | 0.932 | 0.945 |
| average length | asymp | 0.303 | 0.274 | 0.213 | 0.193 | 5.699 | 4.487 | 3.821 | 3.391 | 1.425 | 1.082 | 0.929 | 0.816 |
| | boot | 0.319 | 0.257 | 0.219 | 0.197 | 6.139 | 4.677 | 3.949 | 3.464 | 1.572 | 1.162 | 0.968 | 0.855 |
| | beta | 0.341 | 0.269 | 0.228 | 0.203 | 5.367 | 4.335 | 3.735 | 3.329 | 1.170 | 0.947 | 0.826 | 0.747 |
| | param | 0.292 | 0.242 | 0.210 | 0.191 | 5.729 | 4.494 | 3.848 | 3.389 | 1.546 | 1.170 | 0.983 | 0.869 |

Table 5: Coverage probabilities and average lengths of confidence intervals for the parameter of the Gaussian copula with , the Frank copula with and the Gumbel–Hougaard copula with . All copulas have . Intervals computed via the asymptotic normal approximation (asymp), the straightforward bootstrap (boot), the smoothed beta bootstrap (beta), and the parametric bootstrap (param).

4.4 Testing symmetry of a copula

For a bivariate copula $C$, consider the problem of testing the symmetry hypothesis $H_0 : C(u, v) = C(v, u)$ for all $(u, v) \in [0,1]^2$. We focus on two test statistics proposed in Gen-Nes-Que2012 ,

and also include a version of based on the empirical beta copula, i.e.,

Similarly as in Proposition 1 in Gen-Nes-Que2012 , the statistic can be computed via

with for and as in (12). For fixed , the matrix can be precomputed and stored, reducing the computation time for the resampling methods. A similar modification of into is obviously possible as well, but would be computationally more demanding.

In order to compute $p$-values, we need to generate bootstrap samples from a distribution fulfilling the restriction specified by $H_0$. A natural candidate is a ‘symmetrized’ version of the empirical beta copula

$$\mathbb{C}_n^{\beta,\mathrm{sym}}(u, v) = \tfrac{1}{2}\bigl\{ \mathbb{C}_n^\beta(u, v) + \mathbb{C}_n^\beta(v, u) \bigr\}.$$

When resampling, this simply amounts to interchanging the two coordinates at random in step 3 of Algorithm 3.2. We employ the following three resampling schemes for comparison of actual sizes of the tests.
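Resampling from the symmetrized empirical beta copula can be sketched as below. The function name and toy ranks are hypothetical, and the code assumes that "at random" means swapping the two coordinates with probability 1/2, which corresponds to an equal-weight mixture of the empirical beta copula and its transpose.

```python
import random


def sample_symmetrized_beta_copula(ranks, size, rng):
    """Draw bivariate vectors from an equal-weight mixture of the empirical
    beta copula and its transpose (assumed swap probability 1/2)."""
    n = len(ranks)
    draws = []
    for _ in range(size):
        r1, r2 = ranks[rng.randrange(n)]          # Step 1 of Algorithm 3.2
        v1 = rng.betavariate(r1, n + 1 - r1)      # Step 2, first coordinate
        v2 = rng.betavariate(r2, n + 1 - r2)      # Step 2, second coordinate
        if rng.random() < 0.5:                    # modified Step 3: random swap
            v1, v2 = v2, v1
        draws.append((v1, v2))
    return draws


rng = random.Random(11)  # arbitrary seed
ranks = [[1, 2], [3, 1], [2, 4], [4, 3]]  # hypothetical bivariate ranks
symmetric_sample = sample_symmetrized_beta_copula(ranks, size=100, rng=rng)
```

A test statistic recomputed on such samples behaves as it would under the null hypothesis of symmetry, whatever the asymmetry of the original data.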

  • The symmetrized smoothed beta bootstrap: we resample from to get bootstrap replicates of , , and ;

  • The symmetrized version of the straightforward bootstrap for and ;

  • exchTest in the R package copula copulaR , which implements the multiplier bootstrap for and as described in Gen-Nes-Que2012 and in Section 5 of Kojadinovic-Yan2012 . For , the grid length in exchTest is set to .

Tables 6 and 7 show the actual sizes of the symmetry tests for the Clayton and Gauss copulas. On the whole, the smoothed beta bootstrap works better than exchTest or equally well both for and , except when dependence is strong () and the sample size is small (), although no method produces a satisfying result in this case. The smoothed beta bootstrap with produces actual sizes similar to those with . The statistic performs slightly better than on average, especially for strong positive dependence. The straightforward bootstrap performs poorly in all cases, which is as expected Remillard-Scaillet2009 .

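To compare the power of the tests, the Clayton and Gauss copulas are made asymmetric by Khoudraji’s device Khoudraji1995 , that is, the asymmetric version of a copula $C$ is defined as

$$K_\delta(u, v) = u^\delta\, C(u^{1-\delta}, v), \qquad (u, v) \in [0,1]^2, \quad \delta \in (0, 1).$$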

Table 8 shows the empirical power of and for for the three resampling methods. We see that the smoothed beta bootstraps with and have higher power than exchTest for almost all sample sizes and parameter values considered, and among them, the smoothed beta bootstrap with has a slightly higher power in almost all cases.

50 100 200 400 50 100 200 400
exchTest 0.055 0.033 0.039 0.040 exchTest 0.044 0.035 0.039 0.051
boot 0.021 0.024 0.031 0.034 boot 0.009 0.019 0.027 0.044
beta 0.057 0.038 0.039 0.041 beta 0.046 0.035 0.042 0.059
beta2 0.050 0.037 0.041 0.059
exchTest 0.039 0.029 0.036 0.036 exchTest 0.030 0.022 0.040 0.031
boot 0.009 0.015 0.026 0.034 boot 0.001 0.011 0.020 0.030
beta 0.039 0.039 0.044 0.046 beta 0.042 0.032 0.041 0.043
beta2 0.033 0.033 0.044 0.045
exchTest 0.033 0.020 0.026 0.019 exchTest 0.015 0.014 0.030 0.031
boot 0.001 0.008 0.015 0.017 boot 0.001 0.005 0.017 0.025
beta 0.030 0.029 0.039 0.028 beta 0.020 0.022 0.040 0.047
beta2 0.019 0.023 0.045 0.046
exchTest 0.025 0.026 0.018 0.014 exchTest 0.000 0.007 0.007 0.012
boot 0.000 0.002 0.001 0.008 boot 0.000 0.000 0.001 0.004
beta 0.006 0.017 0.026 0.029 beta 0.000 0.006 0.017 0.029
beta2 0.000 0.007 0.021 0.036
Table 6: Actual sizes of symmetry tests based on and for the Clayton copula (), with -values computed by the multiplier bootstrap (exchTest), the straightforward bootstrap (boot), the smoothed beta bootstrap (beta) and of the test based on , with -values computed by the smoothed beta bootstrap (beta2). The nominal size is
50 100 200 400 50 100 200 400
exchTest 0.047 0.032 0.038 0.039 exchTest 0.026 0.037 0.037 0.041
boot 0.022 0.023 0.032 0.040 boot 0.007 0.014 0.027 0.033
beta 0.044 0.030 0.035 0.043 beta 0.020 0.030 0.036 0.040
beta2 0.022 0.034 0.041 0.042
exchTest 0.028 0.035 0.031 0.040 exchTest 0.029 0.025 0.030 0.041
boot 0.007 0.015 0.023 0.038 boot 0.008 0.015 0.025 0.037
beta 0.033 0.038 0.039 0.045 beta 0.040 0.030 0.033 0.048
beta2 0.037 0.031 0.037 0.048
exchTest