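Let $X_i = (X_{i1}, \dots, X_{id})$, $i \in \{1, \dots, n\}$, be independent and identically distributed random vectors, and suppose that the distribution function $F$ of $X_i$ is continuous. By Sklar’s theorem Sklar59, there exists a unique copula, $C$, such that
\[ F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr), \qquad (x_1, \dots, x_d) \in \mathbb{R}^d, \]
where $F_j$ is the $j$th marginal distribution function of $F$. In fact, in the continuous case, we have $C(u) = F\bigl(F_1^-(u_1), \dots, F_d^-(u_d)\bigr)$ for $u = (u_1, \dots, u_d) \in [0,1]^d$, where $H^-(u) = \inf\{t \in \mathbb{R} : H(t) \ge u\}$ is the generalized inverse of a distribution function $H$. The empirical copula $C_n$ Deheu79 is defined by
\[ C_n(u) = F_n\bigl(F_{n1}^-(u_1), \dots, F_{nd}^-(u_d)\bigr), \]
where, for $j \in \{1, \dots, d\}$,
\[ F_n(x) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d \mathbb{1}\{X_{ij} \le x_j\}, \qquad F_{nj}(x_j) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}\{X_{ij} \le x_j\}. \]
For $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, d\}$, let $R_{ij}$ be the rank of $X_{ij}$ among $X_{1j}, \dots, X_{nj}$; namely,
\[ R_{ij} = \sum_{k=1}^n \mathbb{1}\{X_{kj} \le X_{ij}\}. \tag{1} \]
Frequently used is a rank-based version of the empirical copula given by
\[ \tilde{C}_n(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d \mathbb{1}\{R_{ij}/n \le u_j\}. \tag{2} \]
In the absence of ties, we have
\[ \sup_{u \in [0,1]^d} \bigl| C_n(u) - \tilde{C}_n(u) \bigr| \le \frac{d}{n}. \tag{3} \]
Both functions $C_n$ and $\tilde{C}_n$ are piecewise constant and cannot be genuine copulas. When the sample size is small, they suffer from the presence of ties when used in resampling.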
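As an illustration, the componentwise ranks and the rank-based empirical copula can be computed in a few lines. The following is a minimal Python sketch, not the code used in the paper; the function names and the toy sample are hypothetical, and the sample is assumed to be given as a list of rows.

```python
def ranks(column):
    """Rank of each entry within its column: R_i = #{k : x_k <= x_i}."""
    return [sum(1 for xk in column if xk <= xi) for xi in column]


def rank_matrix(sample):
    """Componentwise ranks R[i][j] of an n x d sample (list of rows)."""
    d = len(sample[0])
    cols = [ranks([row[j] for row in sample]) for j in range(d)]
    return [[cols[j][i] for j in range(d)] for i in range(len(sample))]


def rank_empirical_copula(R, u):
    """Fraction of observations i with R[i][j] / n <= u[j] for every j."""
    n = len(R)
    return sum(all(r <= n * uj for r, uj in zip(row, u)) for row in R) / n


# Toy bivariate sample without ties (illustrative values only).
sample = [(0.3, 2.1), (1.7, 0.4), (0.9, 3.5), (2.2, 2.8)]
R = rank_matrix(sample)
value = rank_empirical_copula(R, (0.5, 0.75))
```

The resulting function is a step function in each argument, which is why it cannot be a genuine copula.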
The empirical beta copula SST2017 is a simple but effective way of correcting and smoothing the empirical copula. Its definition is given in Section 3. Although its asymptotic distribution is the same as that of the usual empirical copula, its accuracy in small samples is usually better, in part because it is always a genuine copula. Moreover, drawing random samples from the empirical beta copula is straightforward.
Because of these properties, it is reasonable to expect that simple and accurate resampling schemes for the empirical copula process can be constructed based on the empirical beta copula. For tail copulas, a simulation study in Kiriliouk-Segers-Tafakori2018 showed that the bootstrap based on the empirical beta copula worked significantly better than the direct multiplier bootstrap from Buech-Dette2010. The purpose of this paper is to investigate further both the finite-sample and asymptotic behavior of this resampling method, now for general copulas.
The paper is structured as follows. In Section 2, we review and discuss the literature on resampling methods for the empirical copula process. The asymptotic properties of two resampling procedures based on the empirical beta copula are investigated in Section 3. In Section 4, extensive simulation studies are conducted to demonstrate the effectiveness of resampling procedures based on the empirical beta copula for constructing confidence intervals for several copula functionals and for testing shape constraints on the copula. We conclude the paper with some discussion and open questions in Section 5. All proofs are grouped together in the Appendix.
2 Review on bootstrapping empirical copula processes
In this section, we give a short review of bootstrapping empirical copula processes, incorporating some newer improvements. We limit ourselves to i.i.d. sequences and note that extensions to stationary time series have been considered in Buech-Volg2013, among others.
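First we recall a basic result on the weak convergence of the empirical copula process. Let $\ell^\infty([0,1]^d)$ be the Banach space of real-valued, bounded functions on $[0,1]^d$, equipped with the supremum norm $\|\cdot\|_\infty$. The arrow $\rightsquigarrow$ denotes weak convergence in the sense used in Vaart-Wellner. The following condition is the only one needed for our convergence results.
Condition 2.1. For each $j \in \{1, \dots, d\}$, the copula $C$ has a continuous first-order partial derivative $\dot{C}_j(u) = \partial C(u) / \partial u_j$ on the set $\{u \in [0,1]^d : 0 < u_j < 1\}$.
The following theorem is proved in Segers2012. Let $\mathbb{U}^C$ denote a $C$-pinned Brownian sheet, i.e., a centered Gaussian process on $[0,1]^d$ with continuous trajectories and covariance function
\[ \operatorname{Cov}\bigl(\mathbb{U}^C(u), \mathbb{U}^C(v)\bigr) = C(u \wedge v) - C(u)\, C(v), \qquad u, v \in [0,1]^d, \]
where $u \wedge v$ denotes the componentwise minimum.
Theorem 2.2. Suppose Condition 2.1 holds. Then we have
\[ \sqrt{n}\,(C_n - C) \rightsquigarrow \mathbb{G}^C, \qquad n \to \infty, \]
in $\ell^\infty([0,1]^d)$, where
\[ \mathbb{G}^C(u) = \mathbb{U}^C(u) - \sum_{j=1}^d \dot{C}_j(u)\, \mathbb{U}^C(1, \dots, 1, u_j, 1, \dots, 1), \]
with $u_j$ appearing at the $j$-th coordinate.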
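Next we introduce notation for the convergence of conditional laws in probability given the data as defined in Kosorok2008; see also (Vaart-Wellner, Section 2.9). Let
\[ \mathrm{BL}_1 = \bigl\{ h : \ell^\infty([0,1]^d) \to \mathbb{R} \;:\; \|h\|_\infty \le 1, \ |h(x) - h(y)| \le \|x - y\|_\infty \text{ for all } x, y \in \ell^\infty([0,1]^d) \bigr\}. \]
If $\hat{X}_n = \hat{X}_n(X_1, \dots, X_n, W_n)$ is a sequence of bootstrapped processes in $\ell^\infty([0,1]^d)$ with random weights $W_n$, then the notation
\[ \hat{X}_n \overset{P}{\underset{W}{\rightsquigarrow}} X, \qquad n \to \infty, \tag{6} \]
means that $\sup_{h \in \mathrm{BL}_1} \bigl| E_W h(\hat{X}_n) - E h(X) \bigr| \to 0$ in outer probability and that $E_W h(\hat{X}_n)^* - E_W h(\hat{X}_n)_* \to 0$ in probability for all $h \in \mathrm{BL}_1$. Here the notation $E_W$ indicates conditional expectation over the weights $W_n$ given the data $X_1, \dots, X_n$, and $h(\hat{X}_n)^*$ and $h(\hat{X}_n)_*$ denote the minimal measurable majorant and maximal measurable minorant, respectively, with respect to the joint data $(X_1, \dots, X_n, W_n)$.
In the sequel, the random weights $W_n$ can signify different things: a multinomial random vector when drawing from the data with replacement, independent and identically distributed multipliers in the multiplier bootstrap, or vectors of order statistics from the uniform distribution when resampling from the empirical beta copula. In (6), the symbol $W$ will then be changed accordingly.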
2.1 Straightforward bootstrap
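Let $W_n = (W_{n1}, \dots, W_{nn})$ be a multinomial random vector with probabilities $(1/n, \dots, 1/n)$, independent of the sample $X_1, \dots, X_n$. Set
\[ C_n^\#(u) = F_n^\#\bigl(F_{n1}^{\#-}(u_1), \dots, F_{nd}^{\#-}(u_d)\bigr), \qquad F_n^\#(x) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d \mathbb{1}\{X_{ij} \le x_j\}, \qquad F_{nj}^\#(x_j) = \frac{1}{n} \sum_{i=1}^n W_{ni}\, \mathbb{1}\{X_{ij} \le x_j\}. \]
We can also define the bootstrapped version of the rank-based empirical copula
\[ \tilde{C}_n^\#(u) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d \mathbb{1}\{R_{ij}^\# / n \le u_j\}, \tag{8} \]
where the bootstrapped ranks are
\[ R_{ij}^\# = \sum_{k=1}^n W_{nk}\, \mathbb{1}\{X_{kj} \le X_{ij}\}. \tag{9} \]
Since a bootstrap sample will have ties with a (large) positive probability, the bound (3) is no longer valid for $C_n^\#$ and $\tilde{C}_n^\#$. But we can prove the following.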
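The proof of Proposition 2.3 is given in the Appendix. Convergence in probability of the conditional laws
\[ \sqrt{n}\,(C_n^\# - C_n) \overset{P}{\underset{W}{\rightsquigarrow}} \mathbb{G}^C, \qquad n \to \infty, \]
in the space $\ell^\infty([0,1]^d)$ was shown in Fer-Rad-Weg04 under the condition that all partial derivatives $\dot{C}_j$ exist and are continuous on $[0,1]^d$ and in Buech-Volg2013 under the weaker Condition 2.1. Because of (3) and Proposition 2.3, it also holds that
\[ \sqrt{n}\,(\tilde{C}_n^\# - \tilde{C}_n) \overset{P}{\underset{W}{\rightsquigarrow}} \mathbb{G}^C, \qquad n \to \infty. \tag{11} \]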
2.2 Multiplier bootstrap with estimated partial derivatives
The multiplier bootstrap for the empirical copula proposed by Remillard-Scaillet2009 has proved useful for many problems. In Buech-Dette2010 it was found to have a better finite-sample performance than other resampling methods for the empirical copula process. We present a modified version given by Buech-Dette2010 that we employ for the simulation studies in Section 4.
be independent and identically distributed non-negative random variables, independent of the data, with, and . Put , and set
Define and . Using Theorem 2.6 in Kosorok2008 and the a.s. convergence , where is the identity function on , we can show that
Hence if is the estimate for , applying finite differencing to the empirical copula at a spacing proportional to , then the processes
give conditional approximations of . Namely, we have
3 Resampling with the empirical beta copula
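The empirical beta copula SST2017 is defined as
\[ C_n^\beta(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n, R_{ij}}(u_j), \qquad u = (u_1, \dots, u_d) \in [0,1]^d, \]
where $R_{ij}$ denote the ranks as in (1) and where, for $u \in [0,1]$ and $r \in \{1, \dots, n\}$,
\[ F_{n,r}(u) = \sum_{s=r}^n \binom{n}{s} u^s (1-u)^{n-s} \]
is the cumulative distribution function of the beta distribution $\mathcal{B}(r, n+1-r)$. In this section, we examine the asymptotic properties of two resampling procedures based on the empirical beta copula.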
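To make the definition concrete, the following minimal Python sketch evaluates the empirical beta copula from a matrix of ranks. It relies on the fact that, for integer $r$, the $\mathcal{B}(r, n+1-r)$ distribution function at $u$ equals the probability that a Binomial$(n, u)$ variable is at least $r$. The function names and the toy rank matrix are hypothetical, and the ranks are assumed to be free of ties.

```python
from math import comb


def beta_cdf_int(r, n, u):
    """CDF at u of Beta(r, n + 1 - r), computed as P(Binomial(n, u) >= r)."""
    return sum(comb(n, s) * u**s * (1 - u) ** (n - s) for s in range(r, n + 1))


def empirical_beta_copula(R, u):
    """Average over observations of the product of beta CDFs at u."""
    n = len(R)
    total = 0.0
    for row in R:
        prod = 1.0
        for r, uj in zip(row, u):
            prod *= beta_cdf_int(r, n, uj)
        total += prod
    return total / n


# Toy rank matrix: each column is a permutation of 1..4 (no ties).
R = [[1, 2], [3, 1], [2, 4], [4, 3]]
value = empirical_beta_copula(R, (0.3, 0.6))
```

With tie-free ranks the margins are exactly uniform, for instance `empirical_beta_copula(R, (0.3, 1.0))` returns 0.3 up to rounding, in line with the function being a genuine copula.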
3.1 Standard bootstrap for the empirical beta copula
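Let $W_n = (W_{n1}, \dots, W_{nn})$ be a multinomial random vector with success probabilities $(1/n, \dots, 1/n)$, independent of the original sample. Set
\[ C_n^{\beta\#}(u) = \frac{1}{n} \sum_{i=1}^n W_{ni} \prod_{j=1}^d F_{n, R_{ij}^\#}(u_j), \]
where $R_{ij}^\#$ are the bootstrapped ranks in (9). Let $S_j \sim \mathrm{Bin}(n, u_j)$, for $j \in \{1, \dots, d\}$, be independent binomial random variables. Let $E_S$ denote expectation with respect to $(S_1, \dots, S_d)$, conditionally on the sample and the multinomial random vector. Since $P(S_j \ge r) = F_{n,r}(u_j)$ for every $r \in \{1, \dots, n\}$, it follows that
\[ C_n^{\beta\#}(u) = E_S\bigl[ \tilde{C}_n^\#(S_1/n, \dots, S_d/n) \bigr], \]
where $\tilde{C}_n^\#$ is the bootstrapped rank-based empirical copula in (8). Similarly, in terms of the rank-based empirical copula $\tilde{C}_n$, the empirical beta copula is
\[ C_n^\beta(u) = E_S\bigl[ \tilde{C}_n(S_1/n, \dots, S_d/n) \bigr]. \]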
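From the weak convergence of the bootstrapped process $\sqrt{n}\,(\tilde{C}_n^\# - \tilde{C}_n)$, we will prove the following proposition. As a consequence, consistency of the bootstrapped process of the (rank-based) empirical copula in (11) entails consistency of the one for the empirical beta copula.
Under Condition 2.1, we have
\[ \sup_{u \in [0,1]^d} \Bigl| \sqrt{n}\,\bigl(C_n^{\beta\#} - C_n^\beta\bigr)(u) - \sqrt{n}\,\bigl(\tilde{C}_n^\# - \tilde{C}_n\bigr)(u) \Bigr| = o_P(1), \qquad n \to \infty, \]
and thus $\sqrt{n}\,(C_n^{\beta\#} - C_n^\beta) \overset{P}{\underset{W}{\rightsquigarrow}} \mathbb{G}^C$ as $n \to \infty$.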
3.2 Bootstrap by drawing samples from the empirical beta copula
The original motivation of SST2017 was resampling: uniform random variables generated independently and rearranged in the order specified by the componentwise ranks of the original sample might in some sense be considered a bootstrap sample. Although this idea turned out to be not entirely correct, it is how the empirical beta copula was originally discovered. In the same spirit, it is natural to study the bootstrap method based on drawing samples from the empirical beta copula $C_n^\beta$.
It is in fact very simple to generate a random variate from $C_n^\beta$.
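Algorithm 3.2. Given the ranks $R_{ij}$, $i \in \{1, \dots, n\}$, $j \in \{1, \dots, d\}$, of the original sample:
1. Generate $I$ from the discrete uniform distribution on $\{1, \dots, n\}$.
2. Generate independently $V_j^\# \sim \mathcal{B}(R_{Ij}, n + 1 - R_{Ij})$, $j \in \{1, \dots, d\}$.
3. Set $V^\# = (V_1^\#, \dots, V_d^\#)$.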
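The steps above can be sketched in Python with the standard library only. This is a minimal illustration under the assumption that the ranks are stored as a list of rows; the function names, the seed handling and the toy rank matrix are hypothetical.

```python
import random


def draw_from_beta_copula(R, rng):
    """One draw: pick a row of ranks at random, then independent beta variates."""
    n = len(R)
    i = rng.randrange(n)  # uniform index on {0, ..., n-1}
    return tuple(rng.betavariate(r, n + 1 - r) for r in R[i])


def beta_copula_sample(R, size, seed=None):
    """Independent draws from the empirical beta copula built on ranks R."""
    rng = random.Random(seed)
    return [draw_from_beta_copula(R, rng) for _ in range(size)]


# Toy rank matrix without ties (illustrative values only).
R = [[1, 2], [3, 1], [2, 4], [4, 3]]
sample = beta_copula_sample(R, size=5, seed=0)
```

Because the beta variates are continuous, a sample drawn this way has no ties with probability one.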
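Repeating the above algorithm $n$ times independently, we get a sample of $n$ independent random vectors drawn from $C_n^\beta$, conditionally on the data $X_1, \dots, X_n$. Let this sample be denoted by $V_i^\# = (V_{i1}^\#, \dots, V_{id}^\#)$, $i \in \{1, \dots, n\}$. We can think of this procedure as a kind of smoothed bootstrap (see Efron1982siam and (Shao-Tu95, Section 3.5)) because the empirical beta copula may be thought of as a smoothed version of the empirical copula.
The joint and marginal empirical distribution functions of the bootstrap sample are
\[ G_n^\#(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d \mathbb{1}\{V_{ij}^\# \le u_j\}, \qquad G_{nj}^\#(u_j) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}\{V_{ij}^\# \le u_j\}. \]
The ranks of the bootstrap sample are given by
\[ R_{ij}^\# = \sum_{k=1}^n \mathbb{1}\{V_{kj}^\# \le V_{ij}^\#\}. \]
These yield bootstrapped versions of the Deheuvels empirical copula, the rank-based empirical copula and the empirical beta copula:
\[ C_n^\#(u) = G_n^\#\bigl(G_{n1}^{\#-}(u_1), \dots, G_{nd}^{\#-}(u_d)\bigr), \qquad \tilde{C}_n^\#(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d \mathbb{1}\{R_{ij}^\# / n \le u_j\}, \qquad C_n^{\beta\#}(u) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d F_{n, R_{ij}^\#}(u_j). \]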
3.3 Approximating sampling distributions of rank statistics by resampling from the empirical beta copula
Statistical inference for $C$ often involves rank statistics. One way to justify this is to appeal to the invariance of $C$ under coordinatewise continuous strictly increasing transformations. Let us hence consider a rank statistic $T = T(R)$, where $R = (R_{ij})$ consists of the coordinatewise ranks of $X_1, \dots, X_n$. Below we suggest a way of approximating its distribution by drawing a sample from $C_n^\beta$ and computing “bootstrap replicates”. This also avoids problems with ties encountered when drawing with replacement from the original data. Specifically, our procedure goes as follows.
Algorithm 3.4 (Smoothed beta bootstrap)
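A minimal Python sketch of such a procedure is given below: for each of a number of bootstrap rounds, a sample of size $n$ is drawn from the empirical beta copula, its componentwise ranks are computed, and the rank statistic is evaluated on them. The function names, the argument `n_boot`, the seed handling and the choice of Spearman’s rho as the example statistic are assumptions made for illustration, not a transcription of the authors’ implementation.

```python
import random


def column_ranks(values):
    """Ranks 1..n of the entries of `values` (ties have probability zero here)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_matrix(sample):
    """Componentwise ranks of an n x d sample given as a list of rows."""
    d = len(sample[0])
    columns = [column_ranks([row[j] for row in sample]) for j in range(d)]
    return [list(row) for row in zip(*columns)]


def smoothed_beta_bootstrap(R, statistic, n_boot, seed=None):
    """Bootstrap replicates of a rank statistic, resampling from the beta copula."""
    rng = random.Random(seed)
    n = len(R)
    replicates = []
    for _ in range(n_boot):
        boot_sample = []
        for _ in range(n):
            i = rng.randrange(n)  # random row of the original rank matrix
            boot_sample.append([rng.betavariate(r, n + 1 - r) for r in R[i]])
        replicates.append(statistic(rank_matrix(boot_sample)))
    return replicates


def spearman_rho(R):
    """Sample Spearman's rho from bivariate ranks (no ties assumed)."""
    n = len(R)
    return 1.0 - 6.0 * sum((a - b) ** 2 for a, b in R) / (n * (n * n - 1))


# Hypothetical input: perfectly concordant ranks for n = 20.
R = [[i, i] for i in range(1, 21)]
replicates = smoothed_beta_bootstrap(R, spearman_rho, n_boot=200, seed=1)
```

The empirical distribution of `replicates` then serves as an approximation to the sampling distribution of the statistic; for instance, its empirical quantiles could be used to form a confidence interval.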
The validity of this procedure follows from our claim in the preceding subsection. Because all the related empirical copula processes are asymptotically equivalent, any difference between the methods can only show up in finite samples, so we examine their small-sample performance. In Subsection 4.2, we construct confidence intervals for some copula functionals based on popular rank statistics.
4 Simulation Studies
We assess the performance of the bootstrap methods presented in Sections 2 and 3 in a wide range of applications. In all of the experiments below, the number of Monte Carlo runs and the number of bootstrap replications are both set to . The nominal confidence level is always 0.95 and we use Clayton, Gumbel-Hougaard, Frank and Gauss copula families, see e.g. Nelsen2006 . Most simulations are done in R with the package copula copulaR , except for Subsection 4.2, where MATLAB code was used.
4.1 Covariance of the limiting process
We compare the estimated covariances of the limiting process based on the standard and smoothed beta bootstrap methods with the partial derivatives multiplier method, which in Buech-Dette2010 is shown to perform better than the straightforward bootstrap or the direct multiplier method. We follow the set-up in Buech-Dette2010 , evaluating the covariance at four points for in the unit square. The variables are such that for . For the bivariate Clayton copula with parameter , Table 1 shows the mean squared error of the estimated covariance based on the partial derivative multiplier method , the standard beta bootstrap and the smoothed beta bootstrap for and . Results for have been copied from Tables 3 and 4 in Buech-Dette2010 . Both methods based on the empirical beta copula outperform the multiplier method in all points but and ).
4.2 Confidence intervals for rank correlation coefficients
We assess the performance of the straightforward bootstrap and the smoothed beta bootstrap (Subsections 2.1 and 3.3) for constructing confidence intervals for two popular rank correlation coefficients for bivariate distributions, Kendall’s $\tau$ and Spearman’s $\rho$, which are known to depend only on the copula $C$ associated with $F$.
The population Kendall’s $\tau$ is defined by
\[ \tau = 4 \int_{[0,1]^2} C(u, v)\, \mathrm{d}C(u, v) - 1. \]
In terms of
the sample Kendall’s is given by
Its asymptotic variance may be estimated by
where , and (see Hol-Wol-Chi2014 ). An asymptotic confidence interval for is thus given by , with
the usual standard normal tail quantile.
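For reference, the sample version of Kendall’s $\tau$ can be computed directly by counting concordant and discordant pairs. The sketch below is a plain $O(n^2)$ Python illustration that assumes no ties; the function name is hypothetical, and the variance estimator used for the asymptotic interval is not reproduced here.

```python
def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[j] - x[i]) * (y[j] - y[i])
            score += (product > 0) - (product < 0)  # +1, -1, or 0 for a tie
    return 2.0 * score / (n * (n - 1))


tau_concordant = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])
```

Since the statistic depends on the data only through the componentwise ranks, it can be passed unchanged to a rank-based resampling scheme.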
This interval can be compared to the confidence intervals obtained by our resampling methods. Table 2 shows the coverage probabilities and the average lengths of the estimated confidence intervals based on the asymptotic distribution, the straightforward bootstrap and the smoothed beta bootstrap for the independence copula () and the Clayton copula with () and (. The smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the three.
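The population Spearman’s $\rho$ and the sample Spearman’s rho $\hat{\rho}_n$ are given by
\[ \rho = 12 \int_{[0,1]^2} C(u, v)\, \mathrm{d}u\, \mathrm{d}v - 3, \qquad \hat{\rho}_n = 1 - \frac{6}{n(n^2 - 1)} \sum_{i=1}^n (R_{i1} - R_{i2})^2. \]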
The limiting distribution of equals that of , so it is possible in principle to construct confidence intervals based on this asymptotics. However, unlike the case of , it is cumbersome and involves the partial derivatives of , which must be estimated, so we omit it from our study here. In Table 3, one can see that the coverage probabilities are more conservative for the smoothed beta bootstrap than for the straightforward bootstrap, but the average lengths of the estimated confidence intervals are very similar for both methods. This could be due to the fact that , as can be directly computed.
4.3 Confidence intervals for a copula parameter
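Suppose that the copula of $F$ is parametrized by $\theta \in \Theta$, so that $F(x) = C_\theta\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)$. When the $F_j$’s are unknown, the resulting problem of estimating $\theta$ is semiparametric and is studied in Gen-Gho-Riv95; Tsuka05. Assume that $C_\theta$ is absolutely continuous with density $c_\theta$, which is differentiable with respect to $\theta$. Replacing the unknown $F_j$’s in the score equation by their (rescaled) empirical counterparts, one gets the estimating equation
\[ \sum_{i=1}^n \frac{\partial}{\partial \theta} \log c_\theta\bigl(\hat{U}_{i1}, \dots, \hat{U}_{id}\bigr) = 0, \tag{16} \]
where $\hat{U}_{ij} = R_{ij} / (n+1)$. The solution $\hat{\theta}_n$ to (16) is called the pseudo-likelihood estimator.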
We compare confidence intervals for when estimated by the pseudo-likelihood estimator based on the asymptotic variance given in Gen-Gho-Riv95 , the straightforward bootstrap, the smoothed beta bootstrap and the classic parametric bootstrap. Tables 4 and 5 show the estimated coverage probabilities and average interval lengths of the confidence intervals for the Clayton, Gauss, Frank and Gumbel–Hougaard copula families. For the Clayton copula, the smoothed beta bootstrap gives the shortest intervals both for and , but only for the coverage probabilities are too liberal, which is somewhat puzzling. For the Frank and Gumbel–Hougaard copulas, the smoothed beta bootstrap gives the most conservative coverage probabilities, but the shortest length among the four. For the Gauss copula, the asymptotic approximation gives significantly smaller coverage probabilities than the nominal value 0.95.
4.4 Testing symmetry of a copula
For a bivariate copula $C$, consider the problem of testing the symmetry hypothesis $H_0 : C(u, v) = C(v, u)$ for all $(u, v) \in [0,1]^2$. We focus on two test statistics proposed in Gen-Nes-Que2012,
and also include a version of based on the empirical beta copula, i.e.,
Similarly as in Proposition 1 in Gen-Nes-Que2012 , the statistic can be computed via
with for and as in (12). For fixed , the matrix can be precomputed and stored, reducing the computation time for the resampling methods. A similar modification of into is obviously possible as well, but would be computationally more demanding.
In order to compute $p$-values, we need to generate bootstrap samples from a distribution fulfilling the restriction specified by $H_0$. A natural candidate is a ‘symmetrized’ version of the empirical beta copula
\[ C_n^{\beta, \mathrm{sym}}(u, v) = \tfrac{1}{2} \bigl\{ C_n^\beta(u, v) + C_n^\beta(v, u) \bigr\}. \]
When resampling, this simply amounts to interchanging the two coordinates with probability $1/2$ in step 3 of Algorithm 3.2. We employ the following three resampling schemes for comparison of actual sizes of the tests.
The symmetrized smoothed beta bootstrap: we resample from to get bootstrap replicates of , , and ;
The symmetrized version of the straightforward bootstrap for and ;
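Drawing from a symmetrized empirical beta copula can be sketched as follows: a bivariate draw is generated from the empirical beta copula and its two coordinates are swapped with probability one half, so that the resampling distribution is exchangeable. This is a minimal Python illustration; the function names, the seed handling and the toy rank matrix are hypothetical.

```python
import random


def draw_symmetrized(R, rng):
    """One bivariate draw from the beta copula, coordinates swapped w.p. 1/2."""
    n = len(R)
    i = rng.randrange(n)
    u, v = (rng.betavariate(r, n + 1 - r) for r in R[i])
    if rng.random() < 0.5:
        u, v = v, u
    return u, v


def symmetrized_sample(R, size, seed=None):
    """Independent draws from the symmetrized empirical beta copula."""
    rng = random.Random(seed)
    return [draw_symmetrized(R, rng) for _ in range(size)]


# Toy bivariate rank matrix without ties (illustrative values only).
R = [[1, 3], [2, 4], [3, 5], [4, 1], [5, 2]]
sample = symmetrized_sample(R, size=2000, seed=42)
share_below_diagonal = sum(u < v for u, v in sample) / len(sample)
```

Under the symmetrized distribution the events $u < v$ and $u > v$ are equally likely, so `share_below_diagonal` should be close to one half whatever the original ranks are.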
Tables 6 and 7 show the actual sizes of the symmetry tests for the Clayton and Gauss copulas. On the whole, the smoothed beta bootstrap works better than exchTest or equally well both for and , except when dependence is strong () and the sample size is small (), although no method produces a satisfying result in this case. The smoothed beta bootstrap with produces actual sizes similar to those with . The statistic performs slightly better than on average, especially for strong positive dependence. The straightforward bootstrap performs poorly in all cases, which is as expected Remillard-Scaillet2009 .
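To compare the power of the tests, the Clayton and Gauss copulas are made asymmetric by Khoudraji’s device Khoudraji1995, that is, the asymmetric version of a copula $C$ is defined as
\[ K_\delta(u, v) = u^\delta\, C(u^{1-\delta}, v), \qquad (u, v) \in [0,1]^2, \quad \delta \in (0, 1). \]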
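As a small numerical illustration, the sketch below applies a one-parameter form of Khoudraji’s device, $K_\delta(u, v) = u^\delta C(u^{1-\delta}, v)$, to a bivariate Clayton copula and checks that the result is no longer symmetric. The one-parameter form, the function names and the parameter values are assumptions chosen for illustration only.

```python
def clayton(u, v, theta):
    """Bivariate Clayton copula with parameter theta > 0."""
    if u == 0.0 or v == 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def khoudraji(copula, delta):
    """Return K_delta(u, v) = u**delta * copula(u**(1 - delta), v)."""
    def k(u, v):
        return u ** delta * copula(u ** (1.0 - delta), v)
    return k


# Hypothetical parameter values: theta = 2, delta = 0.5.
base = lambda u, v: clayton(u, v, 2.0)
asymmetric = khoudraji(base, 0.5)
gap = asymmetric(0.3, 0.7) - asymmetric(0.7, 0.3)
```

The margins remain uniform, since $K_\delta(u, 1) = u^\delta u^{1-\delta} = u$ and $K_\delta(1, v) = v$, while `gap` is nonzero, so the transformed copula violates the symmetry hypothesis.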
Table 8 shows the empirical power of and for for the three resampling methods. We see that the smoothed beta bootstraps with and have higher power than exchTest for almost all sample sizes and parameter values considered, and among them, the smoothed beta bootstrap with has a slightly higher power in almost all cases.