"De copulis non est disputandum!" (Härdle and Okhrin, 2010). Indeed, the utilization of copulas for the construction and the mathematical analysis of multiple tests has meanwhile become an established technique in simultaneous statistical inference; see Dickhaus and Gierl (2013), Bodnar and Dickhaus (2014), Schmidt et al. (2014), Schmidt et al. (2015), Stange et al. (2015), Cerqueti and Lupi (2018), Neumann et al. (2019), and Sections 2.2.4 and 4.4 of Dickhaus (2014). In particular, Dickhaus and Gierl (2013) have explicitly shown that the copula approach leads to the most general construction method for single-step multiple tests under known univariate marginal null distributions of the test statistics. This is due to Sklar’s Theorem (see Sklar (1959)), which implies that any dependency structure among test statistics can be expressed by a copula. Thus, by standardizing the marginal null distributions (i.e., by transforming test statistics into -values), the copula carries the complete distributional information which is necessary for the calibration of a (multivariate) multiplicity-adjusted rejection threshold or local significance level, respectively.
One important class of copula functions is constituted by so-called Archimedean copulas (cf., e.g., Section 6 of Embrechts et al. (2003), Section 5.4 of McNeil et al. (2005), Section 3 of Cerqueti and Lupi (2018), and McNeil and Nešlehová (2009) for the relevance of Archimedean copulas in various applications). Such a copula is defined by a generator function. If the generator is completely monotone, the corresponding Archimedean copula is multivariate totally positive of order 2 (MTP$_2$); see Müller and Scarsini (2005). This positive dependency property is very helpful for proving type I error control of a variety of multiple tests, including the extremely popular linear step-up test by Benjamini and Hochberg (1995) for control of the false discovery rate. Bodnar and Dickhaus (2014) have derived an adjustment factor for optimizing the power of the linear step-up test in the case that the copula of the $p$-values is of Archimedean type.
In practice, the copula of test statistics or $p$-values, respectively, is often an unknown nuisance parameter (of potentially infinite dimension). Therefore, it is natural to pre-estimate this copula and to utilize the estimate in the construction of the rejection rule of the multiple test. We call the resulting multiple test an empirically calibrated multiple test. In the context of Archimedean copulas, Stange et al. (2015) and Bodnar and Dickhaus (2014) have investigated method-of-moments and maximum likelihood estimators, respectively. Their examples, however, are restricted to simple parametric copula families. Neumann et al. (2019) studied non-parametric Bernstein copula estimators with respect to an empirical calibration of multiple tests. Their methodology, however, is not specifically tailored towards Archimedean copula models.
In the present work, we therefore consider the problem of estimating the generator function of an Archimedean copula non-parametrically, and we analyze the impact of the estimation and its uncertainty on the performance of empirically calibrated multiple tests. The material is organized as follows. In Section 2, estimation methods based on Kendall’s processes, as well as a novel modification thereof, are introduced. Section 3 deals with multiple testing methodology based on such estimators. Simulation results are presented in Section 4, and we provide some conclusions in Section 5.
2 Estimation of the Archimedean generator function using Kendall’s process
The application of Kendall’s distribution functions for statistical inference has been treated in quite some detail in Genest et al. (2011). In this section, we briefly summarize their findings and additionally consider a slightly modified generator estimator. Genest et al. (2011) propose an algorithm for estimating an Archimedean copula based on a slightly modified version of the estimated Kendall’s distribution function as defined by Barbe et al. (1996). In Barbe et al. (1996) the asymptotic properties of Kendall’s process are analyzed. The estimator of the Kendall’s distribution function is modified such that these properties still hold and the estimator itself is a Kendall’s distribution function. This avoids numerical issues and, additionally, allows the continuous mapping theorem (CMT) to be applied. The resulting copula estimator is (weakly) consistent. We will see in the next section that, under some conditions, this translates directly to the realized family-wise error rate (FWER) when the modified estimator is utilized in the empirical calibration of a single-step multiple test.
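To make the object underlying this section concrete, the following Python sketch computes a plain empirical Kendall distribution function from a multivariate sample. It is only an illustration under stated assumptions: the pseudo-observations are normalized by $n-1$ (one common convention; the exact normalization in Barbe et al. (1996) and Genest et al. (2011) should be checked against those papers), and the modification of Genest et al. (2011) that turns the estimate into a proper Kendall’s distribution function is not implemented. The function names are hypothetical.

```python
import bisect


def kendall_pseudo_observations(sample):
    """W_i: fraction of the other points lying strictly below point i in every coordinate."""
    n = len(sample)
    pseudo = []
    for i, x in enumerate(sample):
        below = sum(
            1
            for j, y in enumerate(sample)
            if j != i and all(y_k < x_k for y_k, x_k in zip(y, x))
        )
        pseudo.append(below / (n - 1))  # assumed normalization, see the lead-in
    return pseudo


def empirical_kendall_cdf(sample):
    """Return the step function K_n(t) = (1/n) * #{i : W_i <= t}."""
    pseudo = sorted(kendall_pseudo_observations(sample))
    n = len(pseudo)

    def K_n(t):
        return bisect.bisect_right(pseudo, t) / n

    return K_n


# Toy data: a perfectly comonotone bivariate sample, for which K(t) = t.
sample = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
K_n = empirical_kendall_cdf(sample)
```

For the comonotone toy sample the pseudo-observations are equally spaced on $[0,1]$, so the step function tracks the identity, as expected for perfect positive dependence.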
Further, Genest et al. (2011) discussed the identifiability of statistical models based on Kendall’s distribution function. In particular, the two- and three-dimensional cases are proven. In the two-dimensional case this property can be verified directly, i.e., the Archimedean generator can be directly calculated from Kendall’s distribution function. In the three-dimensional case they show that the model is identified by reducing the dimensionality in a sense. More precisely, they show that from Kendall’s distribution function follows , where , are Archimedean generator functions and is the dimension. Inspired by this argumentation, we consider additionally an estimator based on averaging over two-dimensional marginal data samples in our simulations, i.e.,
where is the generator estimator from Genest et al. (2011) and is the sample size.
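The two-dimensional identifiability remark can be illustrated numerically. The sketch below assumes the generator convention $\varphi:[0,1]\to[0,\infty]$ with $\varphi(1)=0$ (this convention is our assumption here) and uses the standard bivariate relation $K(t) = t - \varphi(t)/\varphi'(t)$, which gives $\varphi(t)/\varphi(t_0) = \exp\{\int_{t_0}^{t} (s - K(s))^{-1}\,ds\}$. The Gumbel family, with $\varphi(t) = (-\log t)^{\theta}$ and $K(t) = t - t\log(t)/\theta$, serves as a check. The function names and the reference point $t_0 = 0.5$ are hypothetical choices.

```python
import math


def generator_ratio(kendall_cdf, t, t0=0.5, steps=2000):
    """phi(t) / phi(t0) = exp( int_{t0}^{t} ds / (s - K(s)) ), bivariate case.

    Uses composite Simpson integration; `steps` must be even.
    """
    def integrand(s):
        return 1.0 / (s - kendall_cdf(s))

    h = (t - t0) / steps
    total = integrand(t0) + integrand(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(t0 + k * h)
    return math.exp(total * h / 3.0)


theta = 2.0  # Gumbel copula parameter used for the check


def gumbel_kendall_cdf(s):
    """Kendall distribution function of the bivariate Gumbel copula."""
    return s - s * math.log(s) / theta


ratio = generator_ratio(gumbel_kendall_cdf, t=0.2)
exact = (math.log(0.2) / math.log(0.5)) ** theta  # phi(0.2) / phi(0.5)
```

The generator is recovered only up to a positive multiplicative constant, which is why the sketch reports a ratio relative to a reference point.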
This estimator can be theoretically motivated as follows. Let defined by , , be an -dimensional Archimedean copula, where the generator is a continuous, convex and strictly decreasing function with and is the generalized inverse of . For convenience, we will call both - and - the Archimedean generator and , . Let be a sequence of generators corresponding to given Kendall’s distribution functions which converge weakly to (a sequence of distribution functions converges weakly if and only if it converges pointwise at all continuity points of the limit function). Genest et al. (2011) used their Proposition 6 in order to translate the convergence of the generators to the convergence of the Archimedean copulas. Let us restate this proposition:
Proposition (Proposition 6 in Genest et al. (2011)).
Let , , , … be -monotone Archimedean generators (a generator is -monotone if and only if it has derivatives such that for all , and is decreasing and convex). Then , as , if and only if there exists a sequence of constants such that, for any , , as .
Assume that is only -monotone. Then is not an Archimedean copula anymore. However, the argumentation for the converse only uses the uniform convergence of and for any dimension . This means that the following proposition holds:
Let , , , … be -monotone Archimedean generators. If there exists a sequence of constants such that, for any , , as , then , as .
Applying Algorithm 1 from Genest et al. (2011) to two-dimensional marginal data results in a -monotone estimator (see Section 4.3 in Genest et al. (2011)). For a specific sequence this estimator converges uniformly to the true generator on compact sets in probability (see Section 7 in Genest et al. (2011)). Thus, holds in probability (and therefore, uniformly in probability since is continuous and is a compact interval).
On the other hand, for a -monotone generator the Archimedean copula restricted to the diagonal line is still a one-dimensional distribution function. We will use this function in the next section for calibrating the multiple test under equal local significance levels. This means that we will obtain valid (i.e., in ) local significance levels. Therefore, the negative impact of using two-dimensional marginal data on multiple testing is potentially small. We will explore this further in the simulations of Section 4.
3 Multiple testing
In this section, we apply the Archimedean copula estimator of Genest et al. (2011) in a multiple testing context. The consistency of the realized FWER is a direct consequence of the connection between the FWER and the copula of the test statistics (see Theorem 2 in Dickhaus and Gierl (2013)) and the consistency of the estimator (see Theorem 2 in Genest et al. (2011)).
We perform a multiple test for null hypotheses versus , . The parameter space is a subset of . The marginal test rejects the null hypothesis if and only if the corresponding $p$-value is smaller than a local significance level . Our goal is to find such that the FWER is controlled at the global significance level . A family-wise error (FWE) occurs whenever we reject at least one true null hypothesis.
Let be the parameter vector of interest and . We assume that the following statements hold true.
The null hypothesis has the form , where .
This means that we can only test marginal parameters. In particular, the number of parameters is equal to the number of hypotheses . With a slight abuse of notation, denotes also .
We define the $p$-value by , where is known.
The test statistic tends to larger values under the alternative .
This means that the $p$-values tend to smaller values under alternatives.
The cdf of depends only on .
We test the marginal parameter by using only the $p$-value . It makes sense that does not depend on the other parameters.
There exists a such that .
The important part here is that this equality holds for every point. This means that the parameters are independent of the observed test statistics.
The cdf is continuous.
Under this assumption, we have the following properties: The corresponding $p$-value is valid, i.e., , and uniformly distributed under . We write or for a random variable under a true null hypothesis or under a true alternative , respectively. Further, we have almost surely that , , if and only if , where . In addition, the copula of the test statistics under is unique by Sklar’s theorem.
The parameter vector is a least favorable configuration (LFC).
An LFC is a parameter vector which maximizes the FWER. In our case the LFC is located in the global null hypothesis . This means that weak FWER control (i.e., control on ) entails strong FWER control (i.e., control on ). In our setting the subset pivotality condition (see Westfall and Young (1993)) is sufficient for this assumption (see Lemma 3.1 in Dickhaus and Stange (2013) and Lemma 3.3 in Neumann et al. (2019)).
The copula of the test statistics under the LFC is Archimedean.
Note that can depend on the whole parameter vector .
A pseudo sample of the test statistics under is at hand.
This sample is needed to calibrate the multiple test and can be (approximately) obtained by bootstrapping under (see Efron (1979)) for suitable test statistics.
Calibration and consistency
Under some assumptions, Dickhaus and Gierl (2013) have shown that the FWER is uniformly bounded on the parameter space by the copula of the random vector under the LFC . In our setup this is the same copula as the copula of the test statistics. More precisely, under our assumptions we have
The bound depends on the copula of the test statistics under the LFC . This has the advantage that we only need to estimate the Archimedean copula under one parameter vector in order to calibrate the multiple test. Setting , the estimated local significance level is given by
where is the quantile of the estimated Archimedean copula on the diagonal line. For convenience, we will occasionally write instead of . Note that for and zero otherwise. This means that can be drawn in the graph of (see fig: alpha_loc).
The quantile function is uniformly consistent, i.e., it converges uniformly to in probability as . In particular, is consistent.
The argumentation is the same as for the consistency of in Genest et al. (2011). Let , , be non-decreasing functions with . Then (see Proposition 0.1 in Resnick (1987)). Again, denotes the quantile of . Thus, the function is continuous on the space of cdfs with a suitable norm for the weak convergence. By the CMT, is pointwise consistent on the compact set of continuity points and therefore, uniformly on . ∎
The realized FWER defined by converges in probability to as .
This result follows directly from the preceding lemma on uniform consistency and the CMT applied to . ∎
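As a plausibility check of the link between the diagonal of the copula and the FWER, the following sketch simulates the FWER under the global null hypothesis when the copula is known. It deliberately uses a Clayton copula as a stand-in, because it can be sampled with the standard library through the gamma frailty (Marshall-Olkin) construction; this is not the Gumbel setting of our simulations. The sketch further assumes that the $p$-values are $1-U_j$ for copula-distributed $U_j$, and all function names are hypothetical.

```python
import random


def clayton_draw(m, theta, rng):
    """One draw from an m-dimensional Clayton copula via a gamma frailty."""
    frailty = rng.gammavariate(1.0 / theta, 1.0)
    return [(1.0 + rng.expovariate(1.0) / frailty) ** (-1.0 / theta) for _ in range(m)]


def clayton_local_level(m, theta, alpha):
    """Solve C(u, ..., u) = 1 - alpha in closed form and return 1 - u."""
    u = (1.0 + ((1.0 - alpha) ** (-theta) - 1.0) / m) ** (-1.0 / theta)
    return 1.0 - u


def simulated_fwer(m, theta, alpha, n_sim=20000, seed=1):
    """Monte Carlo estimate of the FWER under the global null hypothesis."""
    rng = random.Random(seed)
    alpha_loc = clayton_local_level(m, theta, alpha)
    errors = 0
    for _ in range(n_sim):
        u = clayton_draw(m, theta, rng)
        # p_j = 1 - u_j; a family-wise error occurs if any p_j < alpha_loc.
        if any(1.0 - u_j < alpha_loc for u_j in u):
            errors += 1
    return errors / n_sim


fwer = simulated_fwer(m=5, theta=2.0, alpha=0.05)
```

With the true copula, the simulated FWER should be close to the nominal level up to Monte Carlo error; with an estimated generator, the deviation reflects the estimation uncertainty discussed above.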
4 Simulations
In this section we consider three simulations. The first simulation compares pointwise the estimated copulas on the diagonal line. The second simulation is more of theoretical interest. There, we sample the test statistics under the LFC directly and continue as in the first simulation. We compare the estimated generators in terms of their calibrated local significance level and empirical power. In the last simulation we consider a misspecified setting by sampling normally distributed data. We use the sample means as test statistics and bootstrap to derive a pseudo sample for the test statistics under the LFC.
Simulation 1 - comparison of the generators
In this simulation a sample of the test statistics is directly taken from a Gumbel copula under various parameters derived from Kendall’s $\tau$. For consistency with Simulation 3, we denote the sample size of the test statistics by instead of . We consider the estimators from Genest et al. (2011) and , which is defined by
where , is the dimension and is the number of Monte Carlo repetitions. In low dimensions it is possible to average over all two-dimensional subsamples as in sec: Section 2. We compare the estimated copulas regarding the pointwise absolute distance to the true copula , i.e., , .
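The averaging idea can be sketched as follows. The sketch takes a pairwise generator estimator as a callable and averages the resulting functions pointwise, either over all coordinate pairs or over a random subset of pairs. Both the pointwise averaging and the random choice of pairs are our reading of the construction and should be treated as assumptions. As a runnable stand-in for the estimator of Genest et al. (2011), the example plugs in a parametric Gumbel generator obtained from Kendall’s $\tau$ via $\theta = 1/(1-\tau)$; all names are hypothetical.

```python
import itertools
import math
import random


def averaged_generator(sample, estimate_pair, n_pairs=None, seed=0):
    """Average pairwise generator estimates over all pairs or over n_pairs random pairs."""
    d = len(sample[0])
    pairs = list(itertools.combinations(range(d), 2))
    if n_pairs is not None and n_pairs < len(pairs):
        pairs = random.Random(seed).sample(pairs, n_pairs)
    estimates = [estimate_pair([(row[i], row[j]) for row in sample]) for i, j in pairs]

    def averaged(x):
        return sum(est(x) for est in estimates) / len(estimates)

    return averaged


def kendall_tau(pairs):
    """Sample version of Kendall's tau for a bivariate sample (ties count as zero)."""
    n = len(pairs)
    score = 0
    for a in range(n):
        for b in range(a + 1, n):
            prod = (pairs[a][0] - pairs[b][0]) * (pairs[a][1] - pairs[b][1])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


def gumbel_generator_from_tau(pairs):
    """Stand-in pairwise estimator (parametric, NOT the non-parametric one)."""
    tau = min(max(kendall_tau(pairs), 0.0), 0.99)  # keep theta finite and >= 1
    theta = 1.0 / (1.0 - tau)
    return lambda x: math.exp(-x ** (1.0 / theta))


# Toy three-dimensional sample with positive dependence through a shared factor.
rng = random.Random(0)
data = []
for _ in range(60):
    z = rng.gauss(0.0, 1.0)
    data.append(tuple(z + rng.gauss(0.0, 1.0) for _ in range(3)))

psi_all_pairs = averaged_generator(data, gumbel_generator_from_tau)
psi_two_pairs = averaged_generator(data, gumbel_generator_from_tau, n_pairs=2)
```

Restricting the average to a fixed number of randomly chosen pairs keeps the cost independent of the number of coordinate pairs, which grows quadratically in the dimension.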
The default setting is , and . In fig: Simulation 1-1 and fig: Simulation 1-2, we plot the resulting Archimedean copulas , where lies on the grid . At each point the copula is averaged over repetitions. Here, GNZ denotes the estimator of Genest et al. (2011) and MC the modified estimator based on two-dimensional marginal samples. In general, the copulas resulting from both estimators are comparable, with some exceptions. Under independence, GNZ performs considerably worse. In dimensions , the distance between MC and the true copula is slightly smaller and more consistent compared to GNZ. Further, it can be observed that the sample size should not be too small (). There is a slight but visible improvement from to . We will consider additional settings in the next two simulations.
Simulation 2 - calibration in an artificial setting
The setting in this simulation is very similar to Simulation 1. Again, no raw data is simulated. Instead, a sample of the test statistics from a Gumbel copula , , with standard normal marginals is given, where corresponds to the copula parameter under the LFC. We use this sample to calibrate the multiple test. This means that the estimation procedure from the first simulation is applied to . Then, we derive the estimated local significance level as in sec: Section 3.
We test the mean vector by the hypotheses versus . The observed test statistics are taken from a Gumbel copula , , with normal marginals. More specifically, and , where denotes the unknown global effect size (for simplicity) and emulates an effect of the data sample size. The copula parameter is directly derived from Kendall’s $\tau$. We set , where is the proportion of true null hypotheses. The two-sided $p$-value is given by .
In this setup, the copula of the test statistics depends indirectly on . In particular, we have under . Note that is an LFC since the Gumbel copula on the diagonal line is non-increasing for , where and . Therefore, it is straightforward to check that .
The default setting is , , and . In fig: Simulation 2-1 to fig: Simulation 2-3, the estimated local significance level and empirical power are compared, respectively. The empirical FWER is omitted but always below . With increasing sample size , both and converge to a local significance level close to the true significance level . With larger dimension , and fall drastically and are close to the Bonferroni correction. Under stronger dependence , and deviate again from the Bonferroni correction and get closer to . In most settings, the GNZ estimator is slightly better in terms of the mean local significance level but slightly worse in terms of standard deviation (see the supplement for the standard deviation plots and numerical results). Overall, this simulation suggests that both estimators are comparable for calibrating multiple tests.
Simulation 3 - calibration in a misspecified setting
In this simulation the data is sampled from a multivariate normal distribution with equi-correlation coefficient and known variances equal to one (i.e., the covariance matrix , where is the identity matrix). We test the mean vector by the hypotheses versus . The test statistic is given by and the two-sided $p$-value is equal to . We have and , where denotes the global effect size. Then is the LFC since the Gaussian copula does not depend on and the LFC is marginally attained at zero. A sample of the test statistics is calculated by bootstrapping the data. Under the LFC, is equal to . We use the bootstrap sample of the test statistics to estimate the generators. The estimation procedure is then the same as in the first two simulations.
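The bootstrap step can be sketched as follows. The sketch assumes test statistics of the form $T_j = \sqrt{n}\,\bar{X}_j$ (sample means scaled by $\sqrt{n}$ under known unit variances) and emulates the LFC by centering each bootstrap mean at the observed mean; both choices, the parameter values and the function names are our assumptions for illustration. The two-sided $p$-value is computed as $2(1-\Phi(|t|))$.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_sided_p_value(t):
    """Two-sided p-value 2 * (1 - Phi(|t|)) for a standard normal test statistic."""
    return 2.0 * (1.0 - normal_cdf(abs(t)))


def bootstrap_statistics(data, n_boot, seed=0):
    """Pseudo sample of sqrt(n) * (bootstrap mean - observed mean), row-wise resampling."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    pseudo_sample = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        pseudo_sample.append(
            [math.sqrt(n) * (sum(row[j] for row in resample) / n - means[j]) for j in range(d)]
        )
    return pseudo_sample


# Toy equi-correlated normal data with unit variances (illustrative values only).
rng = random.Random(0)
rho, effect, n, d = 0.5, 0.3, 50, 3
data = []
for _ in range(n):
    z = rng.gauss(0.0, 1.0)
    data.append(
        [effect + math.sqrt(rho) * z + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0) for _ in range(d)]
    )

pseudo_sample = bootstrap_statistics(data, n_boot=200)
```

Resampling whole rows preserves the dependence between the coordinates, which is what the subsequent generator estimation relies on.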
The default setting is , , and . In fig: Simulation 3-1 and fig: Simulation 3-2, the mean empirical power over tests resulting from the estimated local significance level are compared. The empirical FWER is omitted and (almost) always below . The only exceptions are due to numerical issues. With increasing , the tends to fail in this setup. This function is used for the estimator . We skipped these cases, so the results regarding GNZ in the right plot of fig: Simulation 3-1 are potentially averaged over fewer than tests and can contain numerical errors. In most settings the empirical power is close to that of the Bonferroni method. A notable improvement is achieved under large equi-correlation . Overall, this simulation unfortunately suggests that the Bonferroni correction is preferable when the assumption that the copula of the test statistics is Archimedean is violated.
5 Conclusions
We considered two non-parametric Archimedean generator estimators in the context of multiple testing. The consistency of the modified estimator is established and can be observed in fig: Simulation 1-1 and fig: Simulation 2-1 as well.
The simulations suggest that both estimators perform comparably, but the modified estimator is easier to implement and numerically more stable. Some improvements of this estimator which are visible when comparing both estimators directly do not carry over to a better performance in multiple testing. For example, the modified estimator performs considerably better under independence, but the performance of both resulting multiple tests is almost identical to the Bonferroni test in this situation.
The modified estimator is still affected by the curse of dimensionality. However, we did not perform separate sample size calculations given the dimension and power. By Simulation 2, it is likely that the sample size needs to be much larger than the dimension for a performance comparable with the true generator. For example, in fig: Simulation 2-1 the dimension is and the sample size needs to be larger than .
We observed that the Archimedean assumption is essential unless strong dependencies are present. However, this might be only because the dependency is not utilized by the Bonferroni test. We did not perform comparisons with other copula estimation methods. Hence, we cannot generally conclude that multiple tests should be performed using non-parametric Archimedean generator estimators under strong dependencies.
The source code, plots and numerical results of the simulations are available online.
References
- Barbe et al. (1996) Barbe, P., Genest, C., Ghoudi, K., and Rémillard, B. (1996). On Kendall’s process. Journal of Multivariate Analysis, 58(2):197–229.
- Benjamini and Hochberg (1995) Benjamini, Y. and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1):289–300.
- Bodnar and Dickhaus (2014) Bodnar, T. and Dickhaus, T. (2014). False discovery rate control under Archimedean copula. Electronic Journal of Statistics, 8(2):2207–2241.
- Cerqueti and Lupi (2018) Cerqueti, R. and Lupi, C. (2018). Copulas, uncertainty, and false discovery rate control. International Journal of Approximate Reasoning, 100:105–114.
- Dickhaus (2014) Dickhaus, T. (2014). Simultaneous Statistical Inference with Applications in the Life Sciences. Springer-Verlag Berlin Heidelberg.
- Dickhaus and Gierl (2013) Dickhaus, T. and Gierl, J. (2013). Simultaneous test procedures in terms of p-value copulae. In Proceedings on the 2nd Annual International Conference on Computational Mathematics, Computational Geometry & Statistics (CMCGS 2013), pages 75–80. Global Science and Technology Forum (GSTF).
- Dickhaus and Stange (2013) Dickhaus, T. and Stange, J. (2013). Multiple point hypothesis test problems and effective numbers of tests for control of the family-wise error rate. Calcutta Statistical Association Bulletin, 65(257-260):123–144.
- Efron (1979) Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7(1):1–26.
- Embrechts et al. (2003) Embrechts, P., Lindskog, F., and McNeil, A. (2003). Modelling Dependence with Copulas and Applications to Risk Management. In Rachev, S. T., editor, Handbook of Heavy Tailed Distributions in Finance, pages 329–384. Elsevier Science B.V.
- Genest et al. (2011) Genest, C., Nešlehová, J., and Ziegel, J. (2011). Inference in multivariate Archimedean copula models. TEST, 20(2):223–256.
- Härdle and Okhrin (2010) Härdle, W. K. and Okhrin, O. (2010). De copulis non est disputandum - Copulae: an overview. AStA Advances in Statistical Analysis, 94(1):1–31.
- McNeil et al. (2005) McNeil, A. J., Frey, R., and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton, NJ: Princeton University Press.
- McNeil and Nešlehová (2009) McNeil, A. J. and Nešlehová, J. (2009). Multivariate Archimedean copulas, $d$-monotone functions and $\ell_1$-norm symmetric distributions. The Annals of Statistics, 37(5B):3059–3097.
- Müller and Scarsini (2005) Müller, A. and Scarsini, M. (2005). Archimedean copulae and positive dependence. Journal of Multivariate Analysis, 93(2):434–445.
- Neumann et al. (2019) Neumann, A., Bodnar, T., Pfeifer, D., and Dickhaus, T. (2019). Multivariate multiple test procedures based on nonparametric copula estimation. Biometrical Journal, 61(1):40–61.
- Resnick (1987) Resnick, S. I. (1987). Extreme Values, Regular Variation and Point Processes. Springer New York.
- Schmidt et al. (2015) Schmidt, R., Faldum, A., and Gerß, J. (2015). Adaptive designs with arbitrary dependence structure based on Fisher’s combination test. Statistical Methods & Applications, 24(3):427–447.
- Schmidt et al. (2014) Schmidt, R., Faldum, A., Witt, O., and Gerß, J. (2014). Adaptive designs with arbitrary dependence structure. Biometrical Journal, 56(1):86–106.
- Sklar (1959) Sklar, M. (1959). Fonctions de répartition à $n$ dimensions et leurs marges. Publications de l’Institut de statistique de l’Université de Paris, 8:229–231.
- Stange et al. (2015) Stange, J., Bodnar, T., and Dickhaus, T. (2015). Uncertainty quantification for the family-wise error rate in multivariate copula models. AStA Advances in Statistical Analysis, 99(3):281–310.
- Westfall and Young (1993) Westfall, P. H. and Young, S. S. (1993). Resampling-based multiple testing: examples and methods for p-value adjustment. Wiley Series in Probability and Mathematical Statistics. Applied Probability and Statistics. Wiley, New York.