# On Some Resampling Procedures with the Empirical Beta Copula

The empirical beta copula is a simple but effective smoother of the empirical copula. Because it is a genuine copula, from which, moreover, it is particularly easy to sample, it is reasonable to expect that resampling procedures based on the empirical beta copula are expedient and accurate. In this paper, after reviewing the literature on some bootstrap approximations for the empirical copula process, we first show the asymptotic equivalence of several bootstrapped processes related to the empirical copula and empirical beta copula. Then we investigate the finite-sample properties of resampling schemes based on the empirical (beta) copula by Monte Carlo simulation. More specifically, we consider interval estimation for some functionals such as rank correlation coefficients and dependence parameters of several well-known families of copulas, constructing confidence intervals by several methods and comparing their accuracy and efficiency. We also compute the actual size and power of symmetry tests based on several resampling schemes for the empirical copula and empirical beta copula.

## 1 Introduction

Let $X_i = (X_{i1}, \dots, X_{id})$, $i \in \{1, \dots, n\}$, be independent and identically distributed random vectors, and assume that the cumulative distribution function, $F$, of $X_i$ is continuous. By Sklar’s theorem Sklar59 , there exists a unique copula, $C$, such that

$$F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr),$$

where $F_j$ is the $j$th marginal distribution function of $F$. In fact, in the continuous case, we have $C(u) = F(F_1^-(u_1), \dots, F_d^-(u_d))$ for $u \in [0,1]^d$, where $H^-(u) = \inf\{t \in \mathbb{R} : H(t) \ge u\}$ is the generalized inverse of a distribution function $H$. The empirical copula Deheu79 is defined by

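$$C_n(u) := F_n\bigl(F_{n1}^-(u_1), \dots, F_{nd}^-(u_d)\bigr),$$

where, for $j \in \{1, \dots, d\}$,

$$F_n(x) := \frac{1}{n} \sum_{i=1}^n 1\{X_{i1} \le x_1, \dots, X_{id} \le x_d\}, \qquad F_{nj}(x_j) := \frac{1}{n} \sum_{i=1}^n 1\{X_{ij} \le x_j\}.$$

For $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, d\}$, let $R_{ij,n}$ be the rank of $X_{ij}$ among $X_{1j}, \dots, X_{nj}$; namely,

$$R_{ij,n} = \sum_{k=1}^n 1\{X_{kj} \le X_{ij}\}. \tag{1}$$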

Frequently used is a rank-based version of the empirical copula given by

$$\tilde{C}_n(u) := \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d 1\Bigl\{\frac{R_{ij,n}}{n} \le u_j\Bigr\}. \tag{2}$$

In the absence of ties, we have

$$\|\tilde{C}_n - C_n\|_\infty := \sup_{u \in [0,1]^d} |\tilde{C}_n(u) - C_n(u)| \le \frac{d}{n}. \tag{3}$$

Both functions $C_n$ and $\tilde{C}_n$ are piecewise constant and cannot be genuine copulas. When the sample size is small, they suffer from the presence of ties when used in resampling.
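As a small illustration of definitions (1) and (2), the following sketch computes componentwise ranks and evaluates the rank-based empirical copula at a point. The function names `column_ranks` and `rank_empirical_copula` and the toy data are hypothetical and chosen only for this example; the code uses plain Python and is not taken from the paper.

```python
def column_ranks(values):
    """Ranks R_i = #{k : values[k] <= values[i]}, as in (1)."""
    return [sum(1 for v in values if v <= x) for x in values]


def rank_empirical_copula(ranks, u):
    """Evaluate the rank-based empirical copula (2) at the point u.

    ranks[i][j] is the rank of observation i in coordinate j.
    """
    n = len(ranks)
    count = sum(
        1 for row in ranks
        if all(r / n <= uj for r, uj in zip(row, u))
    )
    return count / n


# Toy bivariate sample without ties (made-up numbers).
sample = [(0.3, 1.2), (1.7, 0.4), (0.9, 2.5), (2.2, 3.1)]
n, d = len(sample), len(sample[0])
cols = [column_ranks([x[j] for x in sample]) for j in range(d)]
ranks = [[cols[j][i] for j in range(d)] for i in range(n)]

print(ranks)
print(rank_empirical_copula(ranks, (0.5, 0.5)))
```

The function is a step function in `u`, which is the piecewise-constant behavior noted above.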

The empirical beta copula SST2017 is a simple but effective way of correcting and smoothing the empirical copula. Its definition will be given in Section 3. Even though its asymptotic distribution is the same as that of the usual empirical copula, its accuracy in small samples is usually better, partly because it is itself always a genuine copula. Moreover, drawing random samples from the empirical beta copula is quite straightforward.

Because of these properties, it is reasonable to expect that simple and accurate resampling schemes for the empirical copula process can be constructed based on the empirical beta copula. For tail copulas, a simulation study in Kiriliouk-Segers-Tafakori2018 showed that the bootstrap based on the empirical beta copula worked significantly better than the direct multiplier bootstrap from Buech-Dette2010 . The purpose of this paper is to investigate further both the finite-sample and asymptotic behavior of this resampling method, but then for general copulas.

The paper is structured as follows. In Section 2, we review and discuss the literature on resampling methods for the empirical copula process. The asymptotic properties of two resampling procedures based on the empirical beta copula are investigated in Section 3. In Section 4, extensive simulation studies are conducted to demonstrate the effectiveness of resampling procedures based on the empirical beta copula for constructing confidence intervals for several copula functionals and for testing shape constraints on the copula. We conclude the paper with some discussion and open questions in Section 5. All proofs are grouped together in the Appendix.

## 2 Review on bootstrapping empirical copula processes

In this section, we give a short review on bootstrapping empirical copula processes, incorporating some newer improvements. We limit ourselves to i.i.d. sequences and note that extensions to stationary time series have been considered in Buech-Volg2013 , among others.

First we recall a basic result on the weak convergence of the empirical copula process. Let $\ell^\infty([0,1]^d)$ be the Banach space of real-valued, bounded functions on $[0,1]^d$, equipped with the supremum norm $\|\cdot\|_\infty$. The arrow $\rightsquigarrow$ denotes weak convergence in the sense used in Vaart-Wellner . The following condition is the only one needed for our convergence results.

###### Condition 2.1

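For each $j \in \{1, \dots, d\}$, the copula $C$ has a continuous first-order partial derivative $\dot{C}_j(u) = \partial C(u) / \partial u_j$ on the set $\{u \in [0,1]^d : 0 < u_j < 1\}$.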

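The following theorem is proved in Segers2012 . Let $U_C$ denote a $C$-pinned Brownian sheet, i.e., a centered Gaussian process on $[0,1]^d$ with continuous trajectories and covariance function

$$\operatorname{Cov}\{U_C(u), U_C(v)\} = C(u \wedge v) - C(u)\,C(v), \qquad u, v \in [0,1]^d, \tag{4}$$

where $u \wedge v$ denotes the componentwise minimum.

###### Theorem 2.2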

Suppose Condition 2.1 holds. Then we have

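$$\mathbb{G}_n := \sqrt{n}\,(C_n - C) \rightsquigarrow \mathbb{G}_C, \qquad n \to \infty,$$

in $\ell^\infty([0,1]^d)$, where

$$\mathbb{G}_C(u) := U_C(u) - \sum_{j=1}^d \dot{C}_j(u)\, U_C(1, \dots, 1, u_j, 1, \dots, 1),$$

with $u_j$ appearing at the $j$-th coordinate.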

Next we introduce notation for the convergence of conditional laws in probability given the data as defined in

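$$\mathrm{BL}_1 := \bigl\{h : \ell^\infty([0,1]^d) \to \mathbb{R} \;\big|\; \|h\|_\infty \le 1 \text{ and } |h(x) - h(y)| \le \|x - y\|_\infty \text{ for all } x, y \in \ell^\infty([0,1]^d)\bigr\}. \tag{5}$$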

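If $\hat{X}_n$ is a sequence of bootstrapped processes in $\ell^\infty([0,1]^d)$ with random weights $W$, then the notation

$$\hat{X}_n \underset{W}{\overset{P}{\rightsquigarrow}} X, \qquad n \to \infty \tag{6}$$

means that

$$\left.\begin{aligned} &\sup_{h \in \mathrm{BL}_1} \bigl|E_W[h(\hat{X}_n)] - E[h(X)]\bigr| \longrightarrow 0 \quad \text{in outer probability}, \\ &E_W[h(\hat{X}_n)^*] - E_W[h(\hat{X}_n)_*] \overset{P}{\longrightarrow} 0 \quad \text{for all } h \in \mathrm{BL}_1. \end{aligned}\right\} \tag{7}$$

Here the notation $E_W$ indicates conditional expectation over the weights $W$ given the data $X_1, \dots, X_n$, and $h(\hat{X}_n)^*$ and $h(\hat{X}_n)_*$ denote the minimal measurable majorant and maximal measurable minorant, respectively, with respect to the joint data $X_1, \dots, X_n, W$.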

In the sequel, the random weights $W$ can signify different things: a multinomial random vector when drawing from the data with replacement, independent and identically distributed multipliers in the multiplier bootstrap, or vectors of order statistics from the uniform distribution when resampling from the empirical beta copula. In (6), the symbol $W$ will then be changed accordingly.

### 2.1 Straightforward bootstrap

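Let $W_n = (W_{n1}, \dots, W_{nn})$ be a multinomial random vector with probabilities $(1/n, \dots, 1/n)$, independent of the sample $X_1, \dots, X_n$. Set

$$C_n^*(u) = F_n^*\bigl(F_{n1}^{*-}(u_1), \dots, F_{nd}^{*-}(u_d)\bigr),$$

where

$$F_n^*(x) := \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d 1\{X_{ij} \le x_j\}, \qquad F_{nj}^*(x_j) := \frac{1}{n} \sum_{i=1}^n W_{ni}\, 1\{X_{ij} \le x_j\}, \quad j \in \{1, \dots, d\}.$$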

We can also define the bootstrapped version of the rank-based empirical copula

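$$\tilde{C}_n^*(u) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d 1\Bigl\{\frac{R_{ij,n}^*}{n} \le u_j\Bigr\}, \tag{8}$$

where

$$R_{ij,n}^* = \sum_{k=1}^n W_{nk}\, 1\{X_{kj} \le X_{ij}\}. \tag{9}$$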
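The weighted definitions (8) and (9) can be sketched in a few lines. The helper names below (`multinomial_weights`, `bootstrap_ranks`, `bootstrap_rank_copula`) and the toy sample are hypothetical; the multinomial weights are obtained here simply by counting how often each index is drawn in $n$ draws with replacement.

```python
import random


def multinomial_weights(n, rng):
    """Counts W_n1, ..., W_nn from n draws with replacement out of n items."""
    w = [0] * n
    for _ in range(n):
        w[rng.randrange(n)] += 1
    return w


def bootstrap_ranks(sample, w):
    """R*_{ij} = sum_k W_nk 1{X_kj <= X_ij}, as in (9)."""
    n, d = len(sample), len(sample[0])
    return [
        [sum(w[k] for k in range(n) if sample[k][j] <= sample[i][j])
         for j in range(d)]
        for i in range(n)
    ]


def bootstrap_rank_copula(sample, w, u):
    """Evaluate the bootstrapped rank-based empirical copula (8) at u."""
    n = len(sample)
    r_star = bootstrap_ranks(sample, w)
    total = sum(
        w[i] for i in range(n)
        if all(r_star[i][j] / n <= u[j] for j in range(len(u)))
    )
    return total / n


rng = random.Random(0)
sample = [(0.3, 1.2), (1.7, 0.4), (0.9, 2.5), (2.2, 3.1)]  # made-up data
w = multinomial_weights(len(sample), rng)
print(w, bootstrap_rank_copula(sample, w, (0.5, 0.5)))
```

Whenever some weight exceeds one, the corresponding observation is duplicated in the bootstrap sample, which is the source of the ties discussed next.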

Since a bootstrap sample will have ties with a (large) positive probability, the bound (3) is no longer valid for $C_n^*$ and $\tilde{C}_n^*$. But we can prove the following.

###### Proposition 2.3

We have

 (10)

The proof of Proposition 2.3 is given in the Appendix. Convergence in probability of the conditional laws

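$$\sqrt{n}\,(C_n^* - C_n) \underset{W}{\overset{P}{\rightsquigarrow}} \mathbb{G}_C, \qquad n \to \infty,$$

in the space $\ell^\infty([0,1]^d)$ was shown in Fer-Rad-Weg04 under the condition that all partial derivatives $\dot{C}_j$ exist and are continuous on $[0,1]^d$ and in Buech-Volg2013 under the weaker Condition 2.1. Because of (3) and Proposition 2.3, it also holds that

$$\tilde{\alpha}_n := \sqrt{n}\,(\tilde{C}_n^* - \tilde{C}_n) \underset{W}{\overset{P}{\rightsquigarrow}} \mathbb{G}_C, \qquad n \to \infty. \tag{11}$$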

### 2.2 Multiplier bootstrap with estimated partial derivatives

The multiplier bootstrap for the empirical copula proposed by Remillard-Scaillet2009 has proved useful for many problems. In Buech-Dette2010 it was found to have a better finite-sample performance than other resampling methods for the empirical copula process. We present a modified version given by Buech-Dette2010 that we employ for the simulation studies in Section 4.

Let $\xi_1, \dots, \xi_n$ be independent and identically distributed non-negative random variables, independent of the data, with

, and . Put , and set

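$$C_n^\circ(u) := \frac{1}{n} \sum_{i=1}^n \frac{\xi_i}{\bar{\xi}_n} \prod_{j=1}^d 1\{X_{ij} \le F_{nj}^-(u_j)\}, \qquad \tilde{C}_n^\circ(u) := \frac{1}{n} \sum_{i=1}^n \frac{\xi_i}{\bar{\xi}_n} \prod_{j=1}^d 1\{F_{nj}(X_{ij}) \le u_j\}.$$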

Define and . Using Theorem 2.6 in Kosorok2008 and the a.s. convergence , where is the identity function on , we can show that

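$$\beta_n^\circ \underset{\xi}{\overset{P}{\rightsquigarrow}} U_C \quad \text{and} \quad \tilde{\beta}_n^\circ \underset{\xi}{\overset{P}{\rightsquigarrow}} U_C, \qquad n \to \infty.$$

Hence if $\hat{\dot{C}}_j$ is the estimate for $\dot{C}_j$ obtained by applying finite differencing to the empirical copula at a spacing proportional to $n^{-1/2}$, then the processes

$$\begin{cases} \alpha_n^{\mathrm{pdm}\circ}(u) := \beta_n^\circ(u) - \sum_{j=1}^d \hat{\dot{C}}_j(u)\, \beta_n^\circ(1, \dots, 1, u_j, 1, \dots, 1), \\ \tilde{\alpha}_n^{\mathrm{pdm}\circ}(u) := \tilde{\beta}_n^\circ(u) - \sum_{j=1}^d \hat{\dot{C}}_j(u)\, \tilde{\beta}_n^\circ(1, \dots, 1, u_j, 1, \dots, 1) \end{cases}$$

give conditional approximations of $\mathbb{G}_C$. Namely, we have

$$\alpha_n^{\mathrm{pdm}\circ} \underset{\xi}{\overset{P}{\rightsquigarrow}} \mathbb{G}_C \quad \text{and} \quad \tilde{\alpha}_n^{\mathrm{pdm}\circ} \underset{\xi}{\overset{P}{\rightsquigarrow}} \mathbb{G}_C, \qquad n \to \infty.$$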
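To make the construction of the rank-based process with estimated partial derivatives more concrete, here is a rough sketch of one multiplier replicate evaluated at a single point. Several ingredients are assumptions made only for this illustration and are not confirmed by the text above: the multipliers are standard exponential (mean one, variance one), the process is normalized as $\sqrt{n}\,(\tilde{C}_n^\circ - \tilde{C}_n)$, and the partial derivatives are estimated by a central finite difference with spacing $h = n^{-1/2}$ truncated at the boundary of the unit interval. All function names are hypothetical.

```python
import random
from math import sqrt


def rank_copula(ranks, u, weights=None):
    """(1/n) sum_i w_i prod_j 1{R_ij/n <= u_j}; unit weights give the rank-based copula."""
    n = len(ranks)
    if weights is None:
        weights = [1.0] * n
    return sum(
        w for w, row in zip(weights, ranks)
        if all(r / n <= uj for r, uj in zip(row, u))
    ) / n


def partial_derivative(ranks, u, j, h):
    """Central finite difference in coordinate j (an assumed estimator)."""
    up = list(u)
    up[j] = min(u[j] + h, 1.0)
    dn = list(u)
    dn[j] = max(u[j] - h, 0.0)
    return (rank_copula(ranks, up) - rank_copula(ranks, dn)) / (up[j] - dn[j])


def multiplier_replicate(ranks, u, xi):
    """One replicate of the partial-derivative multiplier process at u."""
    n, d = len(ranks), len(u)
    xi_bar = sum(xi) / n
    w = [x / xi_bar for x in xi]  # normalized multipliers xi_i / xi_bar

    def beta(v):
        # Assumed normalization: sqrt(n) * (weighted copula - unweighted copula).
        return sqrt(n) * (rank_copula(ranks, v, w) - rank_copula(ranks, v))

    h = 1.0 / sqrt(n)
    value = beta(u)
    for j in range(d):
        margin = [1.0] * d
        margin[j] = u[j]
        value -= partial_derivative(ranks, u, j, h) * beta(margin)
    return value


rng = random.Random(0)
ranks = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]  # made-up ranks
xi = [rng.expovariate(1.0) for _ in ranks]
value = multiplier_replicate(ranks, (0.4, 0.6), xi)
print(value)
```

Note that the data enter only through the ranks, so new replicates require only fresh multipliers and no re-ranking.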

## 3 Resampling with the empirical beta copula

The empirical beta copula SST2017 is defined as

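$$C_n^\beta(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n,R_{ij,n}}(u_j), \qquad u \in [0,1]^d,$$

where $R_{ij,n}$ denote the ranks as in (1) and where, for $u \in [0,1]$ and $r \in \{1, \dots, n\}$,

$$F_{n,r}(u) = \sum_{s=r}^n \binom{n}{s} u^s (1-u)^{n-s} \tag{12}$$

is the cumulative distribution function of the beta distribution $\mathcal{B}(r, n+1-r)$. In this section, we examine the asymptotic properties of two resampling procedures based on the empirical beta copula.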
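The definition can be evaluated directly from the binomial sum in (12). The sketch below is a minimal, unoptimized illustration; the function names `beta_cdf` and `empirical_beta_copula` and the toy ranks are hypothetical.

```python
from math import comb


def beta_cdf(n, r, u):
    """F_{n,r}(u) from (12), i.e. the probability that Bin(n, u) is at least r."""
    return sum(comb(n, s) * u**s * (1 - u)**(n - s) for s in range(r, n + 1))


def empirical_beta_copula(ranks, u):
    """Evaluate the empirical beta copula at u, given ranks[i][j] = R_{ij,n}."""
    n = len(ranks)
    total = 0.0
    for row in ranks:
        term = 1.0
        for r, uj in zip(row, u):
            term *= beta_cdf(n, r, uj)
        total += term
    return total / n


ranks = [[1, 2], [3, 1], [2, 3], [4, 4]]  # made-up ranks without ties
print(empirical_beta_copula(ranks, (0.5, 0.5)))
print(empirical_beta_copula(ranks, (0.3, 1.0)))  # uniform margin: close to 0.3
```

The second call illustrates that, in the absence of ties, the margins are exactly uniform, since $n^{-1} \sum_{r=1}^n F_{n,r}(u) = u$.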

### 3.1 Standard bootstrap for the empirical beta copula

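Let $W_n = (W_{n1}, \dots, W_{nn})$ be a multinomial random vector with success probabilities $(1/n, \dots, 1/n)$, independent of the original sample. Set

$$C_n^{\beta*}(u) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d F_{n,R_{ij,n}^*}(u_j),$$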

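where $R_{ij,n}^*$ are the bootstrapped ranks in (9). Let $S_j \sim \mathrm{Bin}(n, u_j)$, for $j \in \{1, \dots, d\}$, be independent binomial random variables. Let $E_S$ denote expectation with respect to $(S_1, \dots, S_d)$, conditionally on the sample and the multinomial random vector. It follows that

$$C_n^{\beta*}(u) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d E_S\Bigl[1\Bigl\{\frac{S_j}{n} \ge \frac{R_{ij,n}^*}{n}\Bigr\}\Bigr] = E_S\bigl[\tilde{C}_n^*(S_1/n, \dots, S_d/n)\bigr],$$

where $\tilde{C}_n^*$ is the bootstrapped rank-based empirical copula in (8). Similarly, the empirical beta copula is

$$C_n^\beta(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n,R_{ij,n}}(u_j) = E_S\bigl[\tilde{C}_n(S_1/n, \dots, S_d/n)\bigr],$$

where $\tilde{C}_n$ is the rank-based empirical copula in (2). Consider the bootstrapped processes $\tilde{\alpha}_n$ defined in (11) and $\alpha_n^\beta := \sqrt{n}\,(C_n^{\beta*} - C_n^\beta)$. We find

$$\alpha_n^\beta(u) = E_S\bigl[\tilde{\alpha}_n(S_1/n, \dots, S_d/n)\bigr]. \tag{13}$$

From the weak convergence of the bootstrapped process $\tilde{\alpha}_n$, we will prove the following proposition. As a consequence, consistency of the bootstrapped process of the (rank-based) empirical copula in (11) entails consistency of the one for the empirical beta copula.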

###### Proposition 3.1

Under Condition 2.1, we have

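$$\sup_{u \in [0,1]^d} |\alpha_n^\beta(u) - \tilde{\alpha}_n(u)| = o_p(1), \qquad n \to \infty, \tag{14}$$

and thus $\alpha_n^\beta \underset{W}{\overset{P}{\rightsquigarrow}} \mathbb{G}_C$ as $n \to \infty$.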
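The representation of the empirical beta copula as a binomial average of the rank-based empirical copula, used in this subsection, can be checked numerically for $d = 2$ by exact enumeration over the two binomial variables. The sketch below is only such a sanity check on made-up ranks; all function names are hypothetical.

```python
from math import comb


def binom_pmf(n, s, p):
    """P(Bin(n, p) = s)."""
    return comb(n, s) * p**s * (1 - p)**(n - s)


def rank_copula(ranks, u):
    """Rank-based empirical copula (2)."""
    n = len(ranks)
    return sum(all(r / n <= uj for r, uj in zip(row, u)) for row in ranks) / n


def beta_copula(ranks, u):
    """Empirical beta copula, using F_{n,r}(u) = P(Bin(n, u) >= r)."""
    n = len(ranks)
    total = 0.0
    for row in ranks:
        term = 1.0
        for r, uj in zip(row, u):
            term *= sum(binom_pmf(n, s, uj) for s in range(r, n + 1))
        total += term
    return total / n


def smoothed_rank_copula(ranks, u):
    """E_S[rank copula at (S_1/n, S_2/n)] with independent S_j ~ Bin(n, u_j)."""
    n = len(ranks)
    return sum(
        binom_pmf(n, s1, u[0]) * binom_pmf(n, s2, u[1])
        * rank_copula(ranks, (s1 / n, s2 / n))
        for s1 in range(n + 1)
        for s2 in range(n + 1)
    )


ranks = [[1, 2], [2, 3], [3, 1], [4, 4]]  # made-up ranks
u = (0.3, 0.7)
print(beta_copula(ranks, u), smoothed_rank_copula(ranks, u))
```

Both printed values agree up to floating-point error, which is the identity behind (13).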

### 3.2 Bootstrap by drawing samples from the empirical beta copula

The original motivation of SST2017 was resampling; the uniform random variables generated independently and rearranged in the order specified by the componentwise ranks of the original sample might in some sense be considered as a bootstrap sample. Although this idea turned out to be not entirely correct, it was still how the empirical beta copula was discovered originally. In the same spirit, it is natural to study the bootstrap method based on drawing samples from the empirical beta copula $C_n^\beta$.

It is in fact very simple to generate a random variate $V^\#$ from $C_n^\beta$.

###### Algorithm 3.2

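Given the ranks $R_{ij,n}$, $i \in \{1, \dots, n\}$, $j \in \{1, \dots, d\}$, of the original sample:

1. Generate $I$ from the discrete uniform distribution on $\{1, \dots, n\}$.

2. Generate independently $V_j^\# \sim \mathcal{B}(R_{Ij,n}, n+1-R_{Ij,n})$, $j \in \{1, \dots, d\}$.

3. Set $V^\# = (V_1^\#, \dots, V_d^\#)$.

Repeating the above algorithm $n$ times independently, we get a sample of $n$ independent random vectors drawn from $C_n^\beta$, conditionally on the data $X_1, \dots, X_n$. Let this sample be denoted by $V_i^\# = (V_{i1}^\#, \dots, V_{id}^\#)$, $i \in \{1, \dots, n\}$. We can think of this procedure as a kind of smoothed bootstrap (see Efron1982siam ; Shao-Tu95 , Section 3.5) because the empirical beta copula may be thought of as a smoothed version of the empirical copula.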
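A minimal sketch of Algorithm 3.2 in Python, using the standard-library generator `random.Random.betavariate` for the beta draws. The function names `draw_from_beta_copula` and `beta_copula_sample` and the toy ranks are hypothetical.

```python
import random


def draw_from_beta_copula(ranks, rng):
    """One draw from the empirical beta copula built from the given ranks."""
    n = len(ranks)
    row = ranks[rng.randrange(n)]  # Step 1: pick an observation index uniformly
    # Steps 2-3: independent Beta(R, n + 1 - R) draws, one per coordinate
    return [rng.betavariate(r, n + 1 - r) for r in row]


def beta_copula_sample(ranks, size, rng):
    """Independent draws from the empirical beta copula."""
    return [draw_from_beta_copula(ranks, rng) for _ in range(size)]


rng = random.Random(1)
ranks = [[1, 2], [2, 1], [3, 3]]  # made-up ranks of a sample of size 3
draws = beta_copula_sample(ranks, 2000, rng)
print(draws[:3])
```

Because the beta distributions are continuous, the resulting sample has no ties with probability one, in contrast to a sample drawn with replacement from the data.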

The joint and marginal empirical distribution functions of the bootstrap sample are

$$G_n^\#(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d 1\{V_{ij}^\# \le u_j\}, \qquad G_{nj}^\#(u_j) = \frac{1}{n} \sum_{i=1}^n 1\{V_{ij}^\# \le u_j\}.$$

The ranks of the bootstrap sample are given by

$$R_{ij,n}^\# = n\, G_{nj}^\#(V_{ij}^\#) = \sum_{k=1}^n 1\{V_{kj}^\# \le V_{ij}^\#\}. \tag{15}$$

These yield bootstrapped versions of the Deheuvels empirical copula, the rank-based empirical copula and the empirical beta copula:

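$$\begin{aligned} C_n^\#(u) &:= G_n^\#\bigl(G_{n1}^{\#-}(u_1), \dots, G_{nd}^{\#-}(u_d)\bigr), \\ \tilde{C}_n^\#(u) &:= \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d 1\{R_{ij,n}^\# / n \le u_j\}, \\ C_n^{\beta\#}(u) &:= \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n,R_{ij,n}^\#}(u_j). \end{aligned}$$

###### Proposition 3.3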

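Assume Condition 2.1. Then as $n \to \infty$, we have conditional weak convergence in probability as defined in (6) with respect to the random vectors $V_1^\#, \dots, V_n^\#$ of the bootstrapped empirical copula processes

$$\alpha_n^\# := \sqrt{n}\,(C_n^\# - C_n), \qquad \tilde{\alpha}_n^\# := \sqrt{n}\,(\tilde{C}_n^\# - \tilde{C}_n), \qquad \alpha_n^{\beta\#} := \sqrt{n}\,(C_n^{\beta\#} - C_n^\beta),$$

to the limit process $\mathbb{G}_C$ defined in Theorem 2.2.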

### 3.3 Approximating sampling distributions of rank statistics by resampling from the empirical beta copula

Statistical inference for $C$ often involves rank statistics. One way to justify this is to appeal to the invariance of $C$ under coordinatewise continuous strictly increasing transformations. Let us hence consider a rank statistic $T = T(R_1, \dots, R_n)$, where $R_i = (R_{i1,n}, \dots, R_{id,n})$ is a vector consisting of the coordinatewise ranks of $X_i$. Below we suggest a way of approximating its distribution by drawing a sample from $C_n^\beta$ and computing “bootstrap replicates”. This also avoids problems with ties encountered when drawing with replacement from the original data. Specifically, our procedure goes as follows.

###### Algorithm 3.4 (Smoothed beta bootstrap)

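Given the ranks $R_1, \dots, R_n$ of the original sample:

1. Apply Algorithm 3.2 $n$ times independently to obtain a bootstrap sample $V_1^\#, \dots, V_n^\#$ drawn from $C_n^\beta$, compute their ranks $R_1^\#, \dots, R_n^\#$ as in (15) and put $T^\# = T(R_1^\#, \dots, R_n^\#)$.

2. Repeat Step 1 a moderate to large number, $B$, of times to get bootstrap replicates $T_1^\#, \dots, T_B^\#$.

3. Use $T_1^\#, \dots, T_B^\#$ to approximate the sampling distribution of $T$.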

The validity of this procedure follows from our claim in the preceding subsection. Because all the related empirical copula processes are asymptotically equivalent, we need to look into the small-sample performance of the methods. In Subsection 4.2, we construct confidence intervals for some copula functionals by popular rank statistics.
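The smoothed beta bootstrap can be sketched as follows. Two choices here are assumptions made for illustration and are not prescribed by the algorithm: the rank statistic is the sample version of Spearman's rho for bivariate data, and the replicates are turned into an interval by the simple percentile method. All names (`smoothed_beta_bootstrap`, `percentile_interval`, and so on) and the toy ranks are hypothetical.

```python
import math
import random


def column_ranks(values):
    """Ranks R_i = #{k : values[k] <= values[i]}."""
    return [sum(1 for v in values if v <= x) for x in values]


def ranks_of(sample):
    """Componentwise ranks of a sample, as in (15)."""
    n, d = len(sample), len(sample[0])
    cols = [column_ranks([row[j] for row in sample]) for j in range(d)]
    return [[cols[j][i] for j in range(d)] for i in range(n)]


def spearman_rho(ranks):
    """Sample Spearman's rho for bivariate ranks without ties."""
    n = len(ranks)
    c = (n + 1) / 2
    return 12 / (n * (n * n - 1)) * sum((r1 - c) * (r2 - c) for r1, r2 in ranks)


def smoothed_beta_bootstrap(ranks, statistic, n_boot, rng):
    """Steps 1 and 2: bootstrap replicates of a rank statistic."""
    n = len(ranks)
    replicates = []
    for _ in range(n_boot):
        sample = []
        for _ in range(n):
            row = ranks[rng.randrange(n)]
            sample.append([rng.betavariate(r, n + 1 - r) for r in row])
        replicates.append(statistic(ranks_of(sample)))
    return replicates


def percentile_interval(replicates, level=0.95):
    """Percentile interval (an assumed way of using the replicates in Step 3)."""
    s = sorted(replicates)
    tail = (1.0 - level) / 2.0
    lo = s[math.floor(tail * (len(s) - 1))]
    hi = s[math.ceil((1.0 - tail) * (len(s) - 1))]
    return lo, hi


rng = random.Random(2)
# Made-up ranks with strong positive dependence, n = 20.
ranks = [[i, i + 1 if i % 2 else i - 1] for i in range(1, 21)]
replicates = smoothed_beta_bootstrap(ranks, spearman_rho, 200, rng)
lo, hi = percentile_interval(replicates)
print(spearman_rho(ranks), (lo, hi))
```

Any other rank statistic can be passed in place of `spearman_rho`, since the resampling step depends on the data only through the ranks.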

## 4 Simulation Studies

We assess the performance of the bootstrap methods presented in Sections 2 and 3 in a wide range of applications. In all of the experiments below, the number of Monte Carlo runs and the number of bootstrap replications are both set to . The nominal confidence level is always 0.95 and we use Clayton, Gumbel-Hougaard, Frank and Gauss copula families, see e.g. Nelsen2006 . Most simulations are done in R with the package copula copulaR , except for Subsection 4.2, where MATLAB code was used.

### 4.1 Covariance of the limiting process

We compare the estimated covariances of the limiting process based on the standard and smoothed beta bootstrap methods with the partial derivatives multiplier method, which in Buech-Dette2010 is shown to perform better than the straightforward bootstrap or the direct multiplier method. We follow the set-up in Buech-Dette2010 , evaluating the covariance at four points for in the unit square. The variables are such that for . For the bivariate Clayton copula with parameter , Table 1 shows the mean squared error of the estimated covariance based on the partial derivative multiplier method , the standard beta bootstrap and the smoothed beta bootstrap for and . Results for have been copied from Tables 3 and 4 in Buech-Dette2010 . Both methods based on the empirical beta copula outperform the multiplier method in all points but and ).

### 4.2 Confidence intervals for rank correlation coefficients

We assess the performance of the straightforward bootstrap and the smoothed beta bootstrap (Subsections 2.1 and 3.3) for constructing confidence intervals for two popular rank correlation coefficients for bivariate distributions, Kendall’s $\tau$ and Spearman’s $\rho$, which are known to depend only on the copula $C$ associated with $F$.

The population Kendall’s $\tau$ is defined by

$$\tau(C) := 4 \int_0^1 \int_0^1 C(u_1, u_2)\, \mathrm{d}C(u_1, u_2) - 1.$$

In terms of

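$$Q_{k,i} := \operatorname{sign}\bigl[(X_{k,1} - X_{i,1})(X_{k,2} - X_{i,2})\bigr], \qquad K := \sum_{i=1}^{n-1} \sum_{k=i+1}^n Q_{k,i},$$

the sample Kendall’s $\tau$ is given by $\hat{\tau} := 2K / \{n(n-1)\}$. Its asymptotic variance may be estimated by

$$\hat{\sigma}_\tau^2 := \frac{2}{n(n-1)} \left[ \frac{2(n-2)}{n(n-1)^2} \sum_{i=1}^n (C_i - \bar{C})^2 + 1 - \hat{\tau}^2 \right],$$

where $C_i := \sum_{k=1, k \ne i}^n Q_{k,i}$, and $\bar{C} := n^{-1} \sum_{i=1}^n C_i$ (see Hol-Wol-Chi2014 ). An asymptotic confidence interval for $\tau$ is thus given by $\hat{\tau} \pm z_{\alpha/2}\, \hat{\sigma}_\tau$, with $z_{\alpha/2}$ the usual standard normal tail quantile.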
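The asymptotic interval for Kendall's tau can be sketched as below. The code assumes that $C_i$ is the sum of $Q_{k,i}$ over all $k \ne i$ and that $\bar{C}$ is the average of the $C_i$, which is how we read the variance estimator above; the function name `kendall_tau_interval` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sign(x):
    """Sign of x as -1, 0 or 1."""
    return (x > 0) - (x < 0)


def kendall_tau_interval(sample, level=0.95):
    """Sample Kendall's tau with a normal-approximation confidence interval."""
    n = len(sample)
    q = [[sign((xk[0] - xi[0]) * (xk[1] - xi[1])) for xk in sample]
         for xi in sample]
    K = sum(q[i][k] for i in range(n - 1) for k in range(i + 1, n))
    tau = 2 * K / (n * (n - 1))
    # Assumed definitions: C_i sums Q over all k != i, C_bar is their mean.
    c = [sum(q[i][k] for k in range(n) if k != i) for i in range(n)]
    c_bar = sum(c) / n
    var = 2 / (n * (n - 1)) * (
        2 * (n - 2) / (n * (n - 1) ** 2) * sum((ci - c_bar) ** 2 for ci in c)
        + 1 - tau ** 2
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * sqrt(max(var, 0.0))
    return tau, (tau - half, tau + half)


sample = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]  # made-up data
tau, (lower, upper) = kendall_tau_interval(sample)
print(tau, (lower, upper))
```

For such a tiny sample the interval can extend beyond $[-1, 1]$; the example is meant only to show the arithmetic of the estimator.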

This interval can be compared to the confidence intervals obtained by our resampling methods. Table 2 shows the coverage probabilities and the average lengths of the estimated confidence intervals based on the asymptotic distribution, the straightforward bootstrap and the smoothed beta bootstrap for the independence copula () and the Clayton copula with () and (. The smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the three.

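The population Spearman’s $\rho$ and the sample Spearman’s rho $\hat{\rho}$ are given by

$$\rho(C) := 12 \int_0^1 \int_0^1 \bigl[C(u_1, u_2) - u_1 u_2\bigr]\, \mathrm{d}u_1\, \mathrm{d}u_2, \qquad \hat{\rho} := \frac{12}{n(n^2-1)} \sum_{i=1}^n \Bigl(R_{i1,n} - \frac{n+1}{2}\Bigr)\Bigl(R_{i2,n} - \frac{n+1}{2}\Bigr).$$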

The limiting distribution of equals that of , so it is possible in principle to construct confidence intervals based on this asymptotics. However, unlike the case of , it is cumbersome and involves the partial derivatives of , which must be estimated, so we omit it from our study here. In Table 3, one can see that the coverage probabilities are more conservative for the smoothed beta bootstrap than for the straightforward bootstrap, but the average lengths of the estimated confidence intervals are very similar for both methods. This could be due to the fact that , as can be directly computed.

### 4.3 Confidence intervals for a copula parameter

Suppose that the copula $C$ of $F$ is parametrized by $\theta$, so that $C = C_\theta$. When the $F_j$’s are unknown, the resulting problem of estimating $\theta$ is semiparametric and is studied in Gen-Gho-Riv95 ; Tsuka05 . Assume that $C_\theta$ is absolutely continuous with density $c_\theta$, which is differentiable with respect to $\theta$. Replacing the unknown $F_j$’s in the score equation by their (rescaled) empirical counterparts, one gets the estimating equation

$$\sum_{k=1}^n \frac{\dot{c}_\theta\bigl[F_{n1}(X_{k,1}), F_{n2}(X_{k,2})\bigr]}{c_\theta\bigl[F_{n1}(X_{k,1}), F_{n2}(X_{k,2})\bigr]} = 0, \tag{16}$$

where $\dot{c}_\theta = \partial c_\theta / \partial \theta$. The solution $\hat{\theta}$ to (16) is called the pseudo-likelihood estimator.

We compare confidence intervals for when estimated by the pseudo-likelihood estimator based on the asymptotic variance given in Gen-Gho-Riv95 , the straightforward bootstrap, the smoothed beta bootstrap and the classic parametric bootstrap. Tables 4 and 5 show the estimated coverage probabilities and average interval lengths of the confidence intervals for the Clayton, Gauss, Frank and Gumbel–Hougaard copula families. For the Clayton copula, the smoothed beta bootstrap gives the shortest intervals both for and , but only for the coverage probabilities are too liberal, which is somewhat puzzling. For the Frank and Gumbel–Hougaard copulas, the smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the four. For the Gauss copula, the asymptotic approximation gives significantly smaller coverage probabilities than the nominal value 0.95.

### 4.4 Testing symmetry of a copula

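For a bivariate copula $C$, consider the problem of testing the symmetry hypothesis $H_0 : C(u_1, u_2) = C(u_2, u_1)$ for all $(u_1, u_2) \in [0,1]^2$. We focus on two test statistics proposed in Gen-Nes-Que2012 ,

$$S_n = \int_{[0,1]^2} \bigl[C_n(u_1, u_2) - C_n(u_2, u_1)\bigr]^2\, \mathrm{d}C_n(u_1, u_2), \qquad R_n = \int_{[0,1]^2} \bigl[C_n(u_1, u_2) - C_n(u_2, u_1)\bigr]^2\, \mathrm{d}u_1\, \mathrm{d}u_2,$$

and also include a version of $R_n$ based on the empirical beta copula, i.e.,

$$R_n^\beta = \int_{[0,1]^2} \bigl[C_n^\beta(u_1, u_2) - C_n^\beta(u_2, u_1)\bigr]^2\, \mathrm{d}u_1\, \mathrm{d}u_2.$$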

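Similarly as in Proposition 1 in Gen-Nes-Que2012 , the statistic $R_n^\beta$ can be computed via

$$R_n^\beta = \frac{2}{n^2} \sum_{i=1}^n \sum_{j=1}^n \bigl\{B_n(R_{i1,n}, R_{j1,n})\, B_n(R_{i2,n}, R_{j2,n}) - B_n(R_{i1,n}, R_{j2,n})\, B_n(R_{i2,n}, R_{j1,n})\bigr\}$$

with $B_n(r, s) := \int_0^1 F_{n,r}(u)\, F_{n,s}(u)\, \mathrm{d}u$ for $r, s \in \{1, \dots, n\}$ and $F_{n,r}$ as in (12). For fixed $n$, the matrix $B_n$ can be precomputed and stored, reducing the computation time for the resampling methods. A similar modification of $S_n$ into $S_n^\beta$ is obviously possible as well, but would be computationally more demanding.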
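The double-sum formula can be sketched with an exactly computed matrix, assuming that its entries are the integrals over $[0,1]$ of products of two beta distribution functions of the form (12). Expanding the binomial sums and using $\int_0^1 u^p (1-u)^q\, \mathrm{d}u = p!\, q! / (p+q+1)!$ gives each elementary integral as $\binom{n}{a}\binom{n}{b} / \{(2n+1)\binom{2n}{a+b}\}$. The code uses exact rational arithmetic for clarity rather than speed; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def beta_product_matrix(n):
    """B[r][s] = integral of F_{n,r}(u) F_{n,s}(u) over [0, 1], for r, s in 1..n."""
    # cell[a][b] = C(n,a) C(n,b) * integral of u^(a+b) (1-u)^(2n-a-b) du
    cell = [[Fraction(comb(n, a) * comb(n, b), (2 * n + 1) * comb(2 * n, a + b))
             for b in range(n + 1)] for a in range(n + 1)]
    # B[r][s] is the sum of cell[a][b] over a >= r and b >= s (2-D suffix sum).
    B = [[Fraction(0)] * (n + 2) for _ in range(n + 2)]
    for r in range(n, 0, -1):
        for s in range(n, 0, -1):
            B[r][s] = cell[r][s] + B[r + 1][s] + B[r][s + 1] - B[r + 1][s + 1]
    return B


def symmetry_statistic_beta(ranks, B=None):
    """Symmetry statistic based on the empirical beta copula, via the double sum."""
    n = len(ranks)
    if B is None:
        B = beta_product_matrix(n)
    total = Fraction(0)
    for ri1, ri2 in ranks:
        for rj1, rj2 in ranks:
            total += B[ri1][rj1] * B[ri2][rj2] - B[ri1][rj2] * B[ri2][rj1]
    return float(2 * total / n**2)


symmetric_ranks = [[1, 2], [2, 1], [3, 3]]   # made-up, symmetric as a set
asymmetric_ranks = [[1, 2], [2, 3], [3, 1]]  # made-up, not symmetric
print(symmetry_statistic_beta(symmetric_ranks))
print(symmetry_statistic_beta(asymmetric_ranks))
```

Passing a precomputed `B` to `symmetry_statistic_beta` mirrors the remark above that the matrix depends only on $n$ and can be reused across bootstrap replicates.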

In order to compute $p$-values, we need to generate bootstrap samples from a distribution fulfilling the restriction specified by $H_0$. A natural candidate is a ‘symmetrized’ version of the empirical beta copula

$$C_n^{\beta,\mathrm{sym}}(u_1, u_2) := \tfrac{1}{2}\, C_n^\beta(u_1, u_2) + \tfrac{1}{2}\, C_n^\beta(u_2, u_1).$$

When resampling, this simply amounts to interchanging the two coordinates at random in step 3 of Algorithm 3.2. We employ the following three resampling schemes for comparison of actual sizes of the tests.

• The symmetrized smoothed beta bootstrap: we resample from to get bootstrap replicates of , , and ;

• The symmetrized version of the straightforward bootstrap for and ;

• exchTest in the R package copula copulaR , which implements the multiplier bootstrap for and as described in Gen-Nes-Que2012 and in Section 5 of Kojadinovic-Yan2012 . For , the grid length in exchTest is set to .
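Drawing from the symmetrized empirical beta copula, as described before the list, only requires a random swap of the two coordinates after a beta draw. The sketch below is a minimal illustration with a hypothetical function name `draw_symmetrized` and made-up ranks.

```python
import random


def draw_symmetrized(ranks, rng):
    """One draw from the symmetrized empirical beta copula (bivariate case)."""
    n = len(ranks)
    r1, r2 = ranks[rng.randrange(n)]
    v = [rng.betavariate(r1, n + 1 - r1), rng.betavariate(r2, n + 1 - r2)]
    if rng.random() < 0.5:  # interchange the two coordinates at random
        v.reverse()
    return v


rng = random.Random(3)
ranks = [[1, 3], [2, 4], [3, 5], [4, 1], [5, 2]]  # made-up, asymmetric ranks
draws = [draw_symmetrized(ranks, rng) for _ in range(4000)]
share_below = sum(v1 < v2 for v1, v2 in draws) / len(draws)
print(share_below)  # close to one half for a symmetric copula
```

The printed share of draws with the first coordinate below the second should be near one half, as expected under the symmetry hypothesis.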

Tables 6 and 7 show the actual sizes of the symmetry tests for the Clayton and Gauss copulas. On the whole, the smoothed beta bootstrap works better than exchTest or equally well both for and , except when dependence is strong () and the sample size is small (), although no method produces a satisfying result in this case. The smoothed beta bootstrap with produces actual sizes similar to those with . The statistic performs slightly better than on average, especially for strong positive dependence. The straightforward bootstrap performs poorly in all cases, which is as expected Remillard-Scaillet2009 .

To compare the power of the tests, the Clayton and Gauss copulas are made asymmetric by Khoudraji’s device Khoudraji1995 , that is, the asymmetric version of a copula $C$ is defined as

$$K_\delta(u_1, u_2) = u_1^\delta\, C(u_1^{1-\delta}, u_2), \qquad (u_1, u_2) \in [0,1]^2.$$
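As a quick illustration of Khoudraji's device, the following sketch applies it to the bivariate Clayton copula $C_\theta(u_1, u_2) = (u_1^{-\theta} + u_2^{-\theta} - 1)^{-1/\theta}$ with $\theta > 0$. The parameter values $\theta = 2$ and $\delta = 0.5$ are arbitrary choices for this example and are not the settings of the simulation study; the function names are hypothetical.

```python
def clayton(u1, u2, theta):
    """Bivariate Clayton copula for theta > 0."""
    if u1 == 0.0 or u2 == 0.0:
        return 0.0
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)


def khoudraji(copula, delta):
    """Return K_delta(u1, u2) = u1^delta * C(u1^(1 - delta), u2)."""
    def k_delta(u1, u2):
        return u1 ** delta * copula(u1 ** (1.0 - delta), u2)
    return k_delta


k = khoudraji(lambda a, b: clayton(a, b, theta=2.0), delta=0.5)

print(k(0.3, 1.0), k(1.0, 0.7))  # margins remain uniform
print(k(0.3, 0.7), k(0.7, 0.3))  # the two values differ: asymmetry
```

With $\delta = 0$ the device returns the original copula, so $\delta$ controls how far the construction departs from symmetry.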

Table 8 shows the empirical power of and for for the three resampling methods. We see that the smoothed beta bootstraps with and have higher power than exchTest for almost all sample sizes and parameter values considered, and among them, the smoothed beta bootstrap with has a slightly higher power in almost all cases.