1 Introduction
Understanding the covariance matrix of asset returns is of paramount importance, as asset pricing theories dictate that the distribution of returns is related to the business cycle and consumption states, which affect the demand for holding financial assets and generate time-varying risk premia (Moskowitz, 2003). However, characterizing the covariance structure of returns can be challenging, particularly when the number of assets is large. This also creates an acute problem for investors trying to minimize their portfolio risk when the sample covariance matrix cannot be inverted.
We now take the perspective of a global investor and consider an even more difficult challenge: the stock returns come from various industries in multiple countries across the world. To enable estimation, we employ a country-industry Kronecker structure model, given the short sample period relative to the tremendous cross section of asset returns. However, caution is needed: these countries can be categorized into emerging markets and developed countries, and assuming the same covariance matrix of industry returns for the two groups could lead to undesirable consequences when making optimal investment decisions. Indeed, economic and finance theories suggest that industry returns may covary differently across the two groups of countries. As pointed out by Bekaert and Harvey (1995), developed markets are more financially integrated while emerging markets are more financially segmented, so industries may carry different amounts of systematic risk depending on the level of segmentation. In other words, industries in emerging markets and in developed countries comove to different degrees with the market, an aggregation of all industries.
Because the market return is an aggregation of all industry returns, an industry's systematic risk is simply the value-weighted average of its covariances with the other industries divided by its own variance. We therefore focus on the correlation matrices and propose that they should differ significantly between the two groups of countries. Specifically, emerging markets are characterized by frequent regime switches and sudden changes of fiscal, monetary, and trade policies (Aguiar and Gopinath, 2007). When these economic policies change frequently in an unanticipated way in emerging markets, they tend to make cyclical sectors more procyclical than those in developed markets. Furthermore, Kohn et al. (2018) show that emerging economies produce more commodities than they consume while developed markets do not. Therefore we also expect to detect larger comovements of commodity industry returns with others in emerging countries. Finally, due to different demographic patterns (DellaVigna and Pollet, 2007), we also expect that in emerging countries, recreative business industries covary more with the market returns. Combined, these propositions all highlight the necessity of rigorous statistical testing of the equality of the correlation matrices from the two groups of countries. In the literature, researchers often treat the returns of multiple industries as vector-valued observations over time in each country (Fama and French, 1997; Hong et al., 2007). With the vector-valued approach, as the number of assets typically far exceeds the length of the time series and the number of countries, estimating the correlation matrix can be so challenging that a potentially problematic assumption of temporal independence must be made. In this case, adopting a vector-based two-sample test, such as Li and Chen (2012); Cai et al. (2013a); Cai and Zhang (2016); Chang et al. (2017); Zheng et al. (2019), will be either infeasible due to the small number of observations or invalid due to the violation of temporal independence.
The goal of this article is to test the equality of the two correlation matrices by considering observations that are matrix-valued. Compared to conventional vector-valued observations, matrix-valued observations add one more dimension, corresponding to the time domain. The temporal dimension is allowed to have a wide range of dependence, which is more flexible than the vector-based approach. The addition of the temporal dimension also alleviates the problems of small sample size and insufficient length of the time series, as will be seen later.
To be specific, let correspond to the two groups of countries, emerging and developed, respectively. There are (resp. ) countries in the emerging (resp. developed) group. Denote , for , the matrix of returns for country in group . Each matrix is of size when there are industries and time points (in the application we consider later, there are months). We are interested in inference on the correlation matrix , where is the vectorization operation that stacks the columns of a matrix into a long vector. As the correlation matrix is of enormous size while the sample size is only , it is difficult to estimate or make inference on unless further structural assumptions are imposed.
A common and intuitive assumption for matrix-valued observations is the matrix normal distribution, where the covariance matrix has the Kronecker product structure, that is, ; see, for example, Leng and Tang (2012); Yin and Li (2012); Zhou et al. (2014); Qiu et al. (2016); Han et al. (2016); Zhu and Li (2018). Here, is the covariance matrix of size for the covariances between industries in group and is the covariance matrix along the temporal dimension in group . This Kronecker product reduces the number of unknown parameters from to . Furthermore, a much smaller sample size is needed: without the Kronecker product structure, a sample size of is necessary to make the sample covariance matrix full rank, while with the structure, the sample size is sufficient as long as and . Moreover, the Kronecker structure has been widely adopted in the asset pricing literature, such as the conditional factor models of Brandt and Santa-Clara (2006); Brandt et al. (2009). We further verify the Kronecker product assumption in the stock market application via the hypothesis testing method of Aston et al. (2017); see Section 6.2 for the details.
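To illustrate the dimension reduction delivered by the Kronecker structure, the sketch below counts the free parameters with and without it; the dimensions and the random covariance matrices are illustrative assumptions, not the paper's data.

```python
import numpy as np

# Illustrative sketch of the Kronecker covariance structure: Sigma1 holds
# the (industry) row covariances and Sigma2 the temporal column
# covariances. Dimensions p, q are hypothetical.
rng = np.random.default_rng(0)
p, q = 5, 7

def random_spd(d, rng):
    """Generate a random symmetric positive-definite matrix."""
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)

Sigma1 = random_spd(p, rng)   # row (industry) covariance, p x p
Sigma2 = random_spd(q, rng)   # column (temporal) covariance, q x q

# Covariance of vec(X) under the matrix normal model.
Sigma = np.kron(Sigma2, Sigma1)
assert Sigma.shape == (p * q, p * q)

# Free parameters: pq(pq+1)/2 without the structure versus
# p(p+1)/2 + q(q+1)/2 with it (up to one redundant scale factor).
unstructured = p * q * (p * q + 1) // 2
structured = p * (p + 1) // 2 + q * (q + 1) // 2
print(unstructured, structured)   # 630 vs 43 for p=5, q=7
```

Even for these small dimensions the structured model has far fewer parameters, which is what makes estimation feasible with short samples.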
Under this matrix normal assumption, our goal is to test the equality of the correlation matrices of the industries while treating the temporal covariance matrices as nuisance parameters. Consider the correlation matrices , for , where is the diagonal matrix consisting of the diagonal entries of . As such, we test
(1) 
This is referred to as the two-sample hypothesis test of the equality of the correlation matrices of the two groups with matrix-valued observations.
Furthermore, it is also of interest to test whether the columns of the matrix-valued observations are independent. This is important because, if there is indeed no temporal correlation, then the vector-based approach can be implemented. This goal can be achieved by testing
(2) 
within group . Similarly, independence of the industries within either group can also be tested via the null hypothesis that the corresponding row covariance matrix is diagonal. These are referred to as the one-sample hypothesis tests of the independence of the columns, or rows, respectively, of the matrix-valued observations.
Moreover, when the null hypothesis of the one-sample test is rejected, it is of further interest to identify which months or which industries have nonzero correlations; similarly, when the null hypothesis of the two-sample test is rejected, it is important to further identify which industries have significantly different correlations in emerging countries versus developed countries. These are referred to as
support recovery problems. In our real data application, we employ a comprehensive sample of 30 industry sector returns from 43 countries around the world from 2001:07 to 2017:12. As a prelude, the one-sample null hypothesis, that the temporal covariance matrix is diagonal, is rejected by our method introduced in Section 2, suggesting the existence of significant temporal correlation. This implies that we cannot use the aforementioned vector-based two-sample tests, and it underscores the need to develop a method for the two-sample hypothesis test that directly uses matrix-valued observations (Section 3). According to our analysis (Section 6), the two-sample null hypothesis is also rejected, so we indeed identify significant differences in correlations across the two groups of countries. Furthermore, our support recovery analysis finds evidence consistent with existing economic propositions.
For vector-valued observations, there have been numerous efforts in the estimation of and inference on the covariance, correlation, and precision matrices. On the estimation side, a good number of methods have been proposed to estimate the covariance/correlation matrix in the vector case (e.g., Bickel and Levina, 2008; Rothman et al., 2009; Cai and Liu, 2011; Cai et al., 2012; Han and Liu, 2013; Cai and Zhang, 2016). Meanwhile, various methods for estimating the precision matrix have also been proposed (e.g., Meinshausen et al., 2006; Yuan and Lin, 2007; Friedman et al., 2008; Ravikumar et al., 2011; Cai et al., 2011), and other works extend the single precision matrix to multiple precision matrices (e.g., Danaher et al., 2014; Zhu et al., 2014; Cai et al., 2016). On the inference side, hypothesis testing procedures for vector data have been developed recently. In particular, Cai and Jiang (2011); Li and Chen (2012); Cai et al. (2013a, b); Cai and Zhang (2016); Chang et al. (2017); Zheng et al. (2019), for example, considered the one-sample or two-sample covariance/correlation matrix testing problem in high dimensions. To investigate graphical models, Liu et al. (2013) and Xia et al. (2015), for example, proposed procedures to test properties of the precision matrix in one-sample or two-sample settings.
For matrix-valued observations, the matrix normal distribution, where the covariance matrix has the Kronecker product structure, has been frequently assumed. Under the matrix normal distribution, most existing works focus on the precision matrix, from either an estimation or a testing perspective. For instance, to inspect the graph structure, Leng and Tang (2012); Yin and Li (2012); Zhou et al. (2014) proposed methods to estimate the precision matrix; Qiu et al. (2016); Han et al. (2016); Zhu and Li (2018) extended these further to the joint estimation of multiple precision matrices; and Xia and Li (2017, 2018) studied the one-sample and two-sample hypothesis testing of the structure of the precision matrices.
However, to the best of our knowledge, the literature on either estimation of or inference on the covariance or correlation matrix for matrix-valued observations is rather scarce. Table 1 summarizes the state of the literature on hypothesis testing for both vector-valued and matrix-valued data under the one-sample and two-sample regimes. This article fills the gap of hypothesis testing of correlation structures under the matrix normal assumption in both the one-sample and two-sample cases. The matrix-valued covariance matrix estimation problem is a promising future direction.
                                                Vector-valued data    Matrix-valued data
Covariance or correlation matrix   One-sample                         This article
                                   Two-sample                         This article
Precision matrix                   One-sample
                                   Two-sample
Matrix-valued or tensor-valued data are ubiquitous nowadays. When dealing with such data, and sometimes even vector-valued data, the Kronecker product structure has been a powerful tool because of its ability to approximate an arbitrary matrix (Cai et al., 2019) and to reduce dimensionality. Hafner et al. (2019) used the Kronecker product to approximate the covariance matrix for vector-valued data and aimed to estimate the approximated covariance matrix. Chen et al. (2018) investigated matrix autoregressive models where the coefficient matrix has a Kronecker product structure. For tensor-valued time series, Wang et al. (2019); Chen et al. (2019a); Chen and Chen (2019); Chen et al. (2019b) assumed that the tensor factor model has a signal that exhibits Kronecker structure. Aston et al. (2017); Constantinou et al. (2017) performed tests of the separability of the terms in the Kronecker product. Molstad and Rothman (2019) proposed an algorithm to fit the linear discriminant analysis model with a Kronecker product. These articles demonstrate a wide range of applications in finance, economics, engineering, neuroimaging, geophysics, and many more.

The rest of the article is organized as follows. Section 2 is devoted to the one-sample global hypothesis testing of the independence of the columns or the rows of the matrix-valued observations, and to the recovery of the dependent entries when the global hypothesis is rejected. Section 3 is dedicated to the two-sample global hypothesis testing of the equality of two correlation matrices along one dimension of the matrix-valued observations (in two groups), and to the support recovery of the difference of the two correlation matrices. Section 4 establishes the theoretical properties of these procedures for both the one-sample and two-sample settings. A numerical comparison of our procedures with existing ones via simulation is provided in Section 5, and the real data analysis of the aforementioned stock returns data is given in Section 6. The proofs are delegated to the Appendix.
2 One-Sample Testing of Independence
To formulate the stock return example as the matrix-valued two-sample hypothesis testing problem introduced in (1), we shall first check whether the independence assumption holds for the temporal dimension. Hence, we start with the one-sample testing of (2), which is easier to comprehend due to its simpler structure and notation. In the following subsections, we present the one-sample testing of independence and defer the discussion of the two-sample correlation matrix equality testing to Section 3. We omit the superscript that denotes the group membership.
Suppose there are independent and identically distributed (i.i.d.) centered random matrix-valued observations , each with dimension , from a matrix normal distribution , where is a matrix of zero entries, and the matrix and matrix are the covariance matrices associated with the rows and columns, respectively. The vectorization is a vector of length following a multivariate normal distribution with mean zero and a covariance matrix of the form . Denote and . Without loss of generality (WLOG), we derive the testing procedure below for testing independence relating to the matrix . Note that we can simply transpose the observations so that the roles of and are switched, and the procedure to test can be used to test after the transpose. Our goals are to test the null hypothesis globally
(3) 
and to identify nonzero entries , both of which are invariant up to a constant. As such, even though and are not identifiable as and will lead to the same matrix normal distribution for any positive scalar , this has no effect on the global hypothesis testing procedure of Section 2.1 and the support recovery approach of Section 2.2. Throughout the paper, we use to denote constants whose values may change from line to line.
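The vectorization identity and the scale non-identifiability discussed above can be checked numerically. This is a minimal sketch under assumed dimensions and covariance choices; it is not the paper's procedure, only an illustration of the model.

```python
import numpy as np

# Empirical check that cov(vec(X)) equals the Kronecker product of the
# column and row covariances for a matrix normal X, and that scaling
# (c * Sigma1, Sigma2 / c) leaves the distribution unchanged.
rng = np.random.default_rng(1)
p, q, n = 3, 4, 200_000   # hypothetical dimensions and sample size

Sigma1 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 1.5]])                       # row covariance
Sigma2 = np.eye(q) + 0.3 * np.eye(q, k=1) + 0.3 * np.eye(q, k=-1)

L1, L2 = np.linalg.cholesky(Sigma1), np.linalg.cholesky(Sigma2)
Z = rng.standard_normal((n, p, q))
X = np.einsum('ij,njk,lk->nil', L1, Z, L2)   # X = L1 @ Z @ L2.T, matrix normal

# vec stacks columns, so vec(X) has covariance Sigma2 (x) Sigma1.
vecX = X.transpose(0, 2, 1).reshape(n, p * q)
C = vecX.T @ vecX / n
assert np.max(np.abs(C - np.kron(Sigma2, Sigma1))) < 0.05

# Scale non-identifiability: the Kronecker product is unchanged.
c = 3.7
assert np.allclose(np.kron(Sigma2 / c, c * Sigma1), np.kron(Sigma2, Sigma1))
```

The last assertion mirrors the remark that only the Kronecker product, and hence the correlation structure, is identified.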
2.1 Global Testing Procedure
To test the property of , it is natural to construct the test statistic based on an estimate of . A naive estimate of is , which can also be rewritten as , where denotes the th column of the matrix . In the stock return example, is a vector of length , representing the returns of the industries of country during month . This naive estimate is the same as the sample covariance matrix for vector-valued observations if we treat , for and , as i.i.d. observations. Note that these observations are i.i.d. only when the column covariance matrix is a multiple of the identity matrix, which implies no temporal correlation and is typically unrealistic. According to the definition of the matrix normal distribution, the covariance matrix of any column is proportional to the matrix , i.e., for all . It then follows that, for the naive estimate , there exists a constant such that , and hence is an unbiased estimate of . Similarly, there exists a constant such that is an unbiased estimate of , where (4)
However, the naive estimate above is not efficient and can be improved as follows.
Consider , for . Because of the properties of the matrix normal distribution, we have , which implies that all of the columns of follow an i.i.d. multivariate normal distribution with covariance . This is equivalent to observing i.i.d. random vectors with covariance . Right-multiplying by the matrix can be roughly thought of as pre-whitening the matrix normal distribution, where the column covariance becomes the identity after the linear transformation. Therefore, when is known, is the most efficient, oracle estimate of . Of course, is often unknown in practice, in which case plugging in a legitimate estimate of is a natural approach, and we choose as a candidate. This idea leads to the following estimate of , (5)
where is defined in (4). Note that when , defined above is invertible with probability one. We further comment that there are many appropriate choices for the estimation of besides the simple sample estimator, as long as the estimator satisfies equation (39) in our proof. This may lead us to use, for example, the banded estimator of Rothman et al. (2010) or the adaptive thresholding estimator of Cai and Liu (2011) if we have prior information on the structure of . To test whether is diagonal in (3), it is tempting to consider the magnitudes of all the off-diagonal entries of in (5). However, the estimate in (5) cannot be used directly yet, because the entries can have different levels of variability. Recall that in the simple one-sample test for the mean of i.i.d. random variables, the test statistic is based on the ratio of the estimated mean to its standard error. To treat all the off-diagonal entries , , in a fair manner, it is necessary to standardize them first. In order to standardize, we reexamine the construction of . Since (5) can be reexpressed as
it has the oracle counterpart when is known:
whose entries are
Then, it is natural to define the relevant population variances as
(6) 
for all . Note that the definition of above does not depend on or . Given the observations , the estimates of these variances can be obtained by
(7) 
The variance of can be estimated by . A similar spirit in estimating the variances was used in Cai and Liu (2011) and Cai et al. (2013a), where the observations are vector-valued and need neither pre-whitening nor the plugged-in estimate , while ours are matrix-valued and the estimation is more involved.
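The naive and pre-whitened estimates discussed above can be sketched numerically. The scaling constants and the use of the naive temporal estimate as the plug-in below are assumptions modeled on the description in the text, with hypothetical dimensions; the paper's exact normalization may differ.

```python
import numpy as np

# Hedged sketch: naive temporal estimate as plug-in, then the
# pre-whitened estimate of the row (industry) covariance Sigma1.
rng = np.random.default_rng(4)
p, q, n = 3, 4, 20_000   # illustrative dimensions and sample size

Sigma1 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 1.5]])          # true row covariance
Sigma2 = np.eye(q) + 0.3 * np.eye(q, k=1) + 0.3 * np.eye(q, k=-1)
L1, L2 = np.linalg.cholesky(Sigma1), np.linalg.cholesky(Sigma2)
X = np.einsum('ij,njk,lk->nil', L1, rng.standard_normal((n, p, q)), L2)

# Naive temporal estimate (unbiased only up to a scalar), used as plug-in.
Sigma2_hat = np.einsum('nji,njk->ik', X, X) / (n * p)
# Pre-whitened estimate of Sigma1: right-multiply by the inverse plug-in.
Sigma1_hat = np.einsum('nij,jk,nlk->il', X, np.linalg.inv(Sigma2_hat), X) / (n * q)

# Sigma1 is identified only up to a positive scalar, so compare after
# normalizing both matrices by their traces.
est = Sigma1_hat / np.trace(Sigma1_hat)
tru = Sigma1 / np.trace(Sigma1)
assert np.max(np.abs(est - tru)) < 0.03
```

The trace normalization reflects the scale non-identifiability noted earlier: only the correlation structure of the row covariance is meaningful.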
We can now define the standardized statistics
(8) 
where and are defined in (5) and (7), respectively. The ’s are on the same scale and can be compared together. It is also seen that does not depend on , as the constant in the numerator and denominator of (8) cancels. WLOG, we set for the rest of the article.
Note that the null hypothesis that is diagonal is equivalent to all of the off-diagonal entries of being zero, and hence further equivalent to the maximum absolute off-diagonal entry being zero, i.e., . Therefore, it is natural to construct the following test statistic,
(9) 
where is the standardized statistic for the th entry in (8). Under the alternative hypothesis, there exists at least one nonzero off-diagonal entry , whose associated statistic is large, and the maximum test statistic will be large. Therefore, the null hypothesis should be rejected for large values of the test statistic .
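The standardized statistics (8) and the maximum statistic (9) can be sketched as follows. We treat the columns of the (pre-whitened) observations as approximately i.i.d. vectors and use a Cai-Liu-style entrywise variance estimate; the exact scaling constants in the paper may differ, so this is an illustration rather than the paper's exact procedure.

```python
import numpy as np

# Hedged sketch of the entrywise standardized statistics and the max
# test statistic over off-diagonal entries.
rng = np.random.default_rng(2)

def max_test_statistic(Y):
    """Y: (N, p) array of approximately i.i.d. centered vectors.
    Returns (M, T2) where T2[i, j] is the squared standardized
    covariance statistic and M is its maximum off the diagonal."""
    N, p = Y.shape
    S = Y.T @ Y / N                            # entrywise covariance estimates
    # theta[i, j]: estimated variance of Y_i * Y_j around sigma_ij
    theta = np.einsum('ni,nj->ij', Y**2, Y**2) / N - S**2
    T2 = N * S**2 / theta                      # squared standardized statistic
    off = ~np.eye(p, dtype=bool)
    return T2[off].max(), T2

# Under an alternative with one strongly correlated pair, M is large.
N, p = 500, 10
Y = rng.standard_normal((N, p))
Y[:, 1] = 0.9 * Y[:, 0] + np.sqrt(1 - 0.81) * Y[:, 1]   # corr(0,1) = 0.9
M, _ = max_test_statistic(Y - Y.mean(axis=0))
assert M > 4 * np.log(p)   # far above the null extreme-value level
```

The single large pairwise correlation is enough to push the maximum statistic well above the null level, which is exactly the sparse-alternative behavior described below.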
To perform the hypothesis test based on the test statistic , we further need to establish its null distribution. The exact theoretical properties of its limiting behavior will be discussed in detail in Section 4. For now, we can still gain some intuition about the critical value. Roughly speaking, under the null hypothesis, each is approximately the square of a standard normal random variable due to the standardization, and under certain conditions, the ’s are only weakly correlated with each other. So, loosely speaking, the test statistic is the maximum of squared normals that are weakly dependent. Since the extreme value of the squares of i.i.d. normal random variables is close to , is close to under . To be precise, the theorems in Section 4 show rigorously that, under the null hypothesis and certain regularity assumptions, converges to a Gumbel distribution. Due to this limiting distribution, for any significance level , we can define the global test by
(10) 
where is the indicator function. Here, the quantity
(11) 
is the quantile of the Gumbel distribution with the cumulative distribution function (cdf) . The null hypothesis is diagonal is rejected whenever .
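A small sketch of the rejection rule (10)-(11). Both the Gumbel cdf F(x) = exp(-(8π)^{-1/2} e^{-x/2}) and the centering M - 4 log p + log log p below are the forms standard in this literature (e.g., Cai et al., 2013a) and are assumptions here, since the exact expressions were not reproducible from the text.

```python
import math

# Hedged sketch of the Gumbel critical value and the global test,
# assuming the cdf F(x) = exp(-(8*pi)**(-1/2) * exp(-x/2)) and the
# centering sequence 4*log(p) - log(log(p)) common in this literature.
def gumbel_quantile(alpha):
    """q_alpha solving F(q_alpha) = 1 - alpha."""
    return -math.log(8 * math.pi) - 2 * math.log(math.log(1 / (1 - alpha)))

def reject(M, p, alpha=0.05):
    """Global test: reject when the centered max statistic exceeds q_alpha."""
    return M - 4 * math.log(p) + math.log(math.log(p)) > gumbel_quantile(alpha)

q = gumbel_quantile(0.05)
# Sanity check: plugging q back into the cdf returns 1 - alpha.
F = math.exp(-math.exp(-q / 2) / math.sqrt(8 * math.pi))
assert abs(F - 0.95) < 1e-9
```

A very large max statistic triggers rejection while a small one does not, matching the decision rule stated above.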
We comment that since is the maximum of the , the test is best suited to the case where the alternative hypothesis is sparse, that is, when only a small number of the off-diagonal entries of the covariance matrix are large. As long as one of the off-diagonal entries is large enough, the test will reject the null hypothesis. This test does not assume any other structure on the alternative hypothesis. In Section 4, we will show that this test is optimal against sparse alternatives. Note that, when the alternative is dense and many small off-diagonal entries exist, the proposed test is less capable of rejecting the null. Nevertheless, the large body of literature on portfolio construction typically assumes i.i.d. excess returns, with all serial correlations zero (for a survey, see Brandt (2009)). In practice, temporal correlations are more apparent in daily or even weekly returns due to non-synchronous trading or the bid-ask bounce effect, but much less so at the monthly frequency, so most of them may not differ from zero (Campbell et al., 1997).
It is also worth mentioning that the standardized statistics ’s are useful by themselves to recover the support of . In other words, we can identify the locations of the nonzero entries of by examining the values of , as we now discuss in the next section.
2.2 Support Recovery Procedure
In Section 2.1, we focused on the test of the independence of the rows of by testing globally whether all of the off-diagonal entries of the row covariance matrix are zero. If the null hypothesis is rejected, it is of great value to locate where the covariances are not zero. Taking the stock return data as an example, if the independence of the months is rejected (the matrix-valued observations need to be transposed before being fed into the testing procedure), one may want to identify which months are highly correlated, and if the independence of the industries is rejected, it might be interesting to know which industries are correlated. Another example is brain imaging analysis, where the matrix-valued observations for patients are spatial-temporal data (e.g., Xia and Li, 2017, 2018), and it is worthwhile to investigate further how the voxels of the brain are correlated after the rejection of the independence of voxels. This is called the problem of support recovery.
This problem can be thought of as the simultaneous testing of whether the off-diagonal entries of the covariance matrix are zero. Let the support of , neglecting the diagonal entries, be
(12) 
Since there are off-diagonal covariances to consider for support recovery, based on extreme value theory we can threshold the off-diagonal entries at the following level to obtain an estimate of the support,
(13) 
where the ’s are defined in (8) and is a threshold constant. Section 4 will show that when , the probability of exact recovery goes to 1 asymptotically if the nonzero entries are large enough. This is intuitive, as is close to . Section 4 will further demonstrate that a smaller choice of will fail to recover the support under certain conditions; therefore is optimal. We remark that we aim for the asymptotic exact recovery of the support in this and the following sections; for other purposes, one may refer to alternative multiple testing approaches with familywise error rate or false discovery rate control.
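The thresholding rule (13) can be sketched as below. The threshold level tau * log(p) with tau = 4 on the squared standardized statistics is an assumption matching the extreme-value heuristics above; the paper's exact threshold may include lower-order terms.

```python
import numpy as np

# Hedged sketch of support recovery: threshold the squared standardized
# statistics at tau * log(p). tau = 4 mirrors the optimal constant
# suggested by the extreme-value discussion; exact form is assumed.
def recover_support(T2, tau=4.0):
    """T2: (p, p) squared standardized statistics.
    Returns the off-diagonal index pairs declared nonzero."""
    p = T2.shape[0]
    thresh = tau * np.log(p)
    return {(int(i), int(j)) for i, j in np.argwhere(T2 > thresh) if i != j}

# Toy example with hand-built, purely illustrative statistic values.
T2 = np.zeros((4, 4))
T2[0, 1] = T2[1, 0] = 30.0    # clearly nonzero pair
T2[2, 3] = T2[3, 2] = 1.0     # noise level, below 4*log(4) ~ 5.55
assert recover_support(T2) == {(0, 1), (1, 0)}
```

Entries well above the threshold are kept; near-zero entries are screened out, which is the exact-recovery behavior formalized in Section 4.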
3 Two-Sample Testing of Correlation Matrix Equality
Having derived the procedure for the one-sample testing of independence, we can extend the approach to the two-sample scenario of testing the equality of two correlation matrices. Following the same notation as in the introduction, we have i.i.d. matrix-valued observations from the matrix normal distribution for each of the two groups . Given the definition of the correlation matrices for the two groups in the introduction, we wish to test
(14) 
Hereafter, we use the superscript to distinguish quantities relevant to the two-sample case from their one-sample counterparts. To make inference about the correlation matrices, estimates of these correlation matrices need to be constructed.
Given the observations and , as discussed for the one-sample case in Section 2, we can first construct the estimates of the covariance matrices for the two groups and then obtain the estimate of the correlation matrix for each group by dividing the covariance matrix by the corresponding standard deviations as follows,
(15)  
(16) 
where is the naive estimate of . Again, we cannot directly make inference based on , because they are heteroscedastic. To make them homoscedastic, define the entrywise population variances and their sample counterparts similarly as in (6) and (7). As such, the variance of can be estimated by , where
Consequently, the variance of can be estimated by . Note that, for vector-valued observations, to test the equality of the correlations from two populations, Cai and Zhang (2016) estimated the variance by a careful investigation of the Taylor expansion in the calculation of the correlation from the covariance, and Cai and Liu (2016) introduced a variance stabilization method based on Fisher's transformation. Our approach differs from both methods.
When we focus on a single entry of the hypothesis in (14), such as , then in accordance with the two-sample test with unequal variances for i.i.d. random variables, it is natural to define the standardized statistic as
(17) 
and the maximum test statistic as
(18) 
Because the diagonal entries of the correlation matrices are all 1, the maximum is taken only over the off-diagonal entries. The statistic in the two-sample scenario has properties similar to those of in (9) in the one-sample scenario; see the comment in the paragraph after (11). It will be proven in Section 4 that also converges to a Gumbel distribution under and certain regularity assumptions. Therefore, for a given significance level , the test can be defined in parallel with (10),
(19) 
where is still the quantile of the Gumbel distribution and its expression is in (11). The hypothesis is rejected whenever .
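The two-sample statistics (17)-(18) can be sketched as below: standardized differences of entrywise correlations, maximized over off-diagonal entries. The delta-method variance (1 - r^2)^2 / N used here is a simplification of the paper's entrywise variance estimates, so this is an illustration of the construction rather than the exact procedure.

```python
import numpy as np

# Hedged sketch of the two-sample max statistic over standardized
# differences of correlations; the variance formula is a delta-method
# simplification, not the paper's exact estimator.
rng = np.random.default_rng(3)

def max_corr_diff_statistic(Y1, Y2):
    """Y1, Y2: (N_g, p) arrays of approximately i.i.d. vectors per group.
    Returns the maximum squared standardized correlation difference."""
    R1 = np.corrcoef(Y1, rowvar=False)
    R2 = np.corrcoef(Y2, rowvar=False)
    v1 = (1 - R1**2) ** 2 / Y1.shape[0]
    v2 = (1 - R2**2) ** 2 / Y2.shape[0]
    W2 = (R1 - R2) ** 2 / (v1 + v2 + 1e-12)
    off = ~np.eye(R1.shape[0], dtype=bool)
    return W2[off].max()

p, N = 6, 1000
Y1 = rng.standard_normal((N, p))
Y1[:, 1] = 0.8 * Y1[:, 0] + 0.6 * Y1[:, 1]   # corr(0,1) = 0.8 in group 1
Y2 = rng.standard_normal((N, p))             # independent in group 2
M = max_corr_diff_statistic(Y1, Y2)
assert M > 4 * np.log(p)   # one differing correlation is enough to reject
```

As in the one-sample case, a single entry where the two correlation matrices differ substantially is enough to drive the maximum statistic far above the null level.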
To find which industries have correlations that differ significantly between emerging and developed countries, we need to recover the support of the difference of the correlation matrices between the two groups. Denote the support of by
We threshold the entrywise statistic in (17) at an appropriate level to obtain the estimated support as
(20) 
where is again the threshold constant, and the choice is optimal, as shown in Section 4.
4 Theoretical Properties
We present the theoretical properties of the procedures for the one-sample case in Section 4.1 and the two-sample case in Section 4.2.
The following conventions for notation are adopted. Throughout the article, for a length- vector , denote its Euclidean norm by . For a size matrix , denote its Frobenius norm by and its spectral norm by . For a matrix , let and be its largest and smallest eigenvalues, respectively. Denote its matrix 1-norm by . For two sequences of real numbers and , write (respectively ) if there exists a constant such that (respectively ) holds for all sufficiently large , and write if .

4.1 Theoretical Properties for Testing of Independence
We provide the theoretical justifications of the global testing procedure (10) and the support recovery procedure (13). For the global testing procedure, its theoretical properties are established from two perspectives: the size and the power. Specifically, to study the asymptotic size of the test, we derive the asymptotic distribution of the test statistic under the null hypothesis; to analyze the power, we consider sparse alternatives where only a small subset of the entries is nonzero. For the support recovery procedure, we show that recovers the support with probability tending to one under certain conditions.
Under Conditions (C1) and (C2), as mentioned in Section 2, Theorem 1 shows that indeed converges weakly to a Gumbel distribution under the null hypothesis.

Assume that , , and there exist some constants such that , , and .

Assume that = .
Condition (C1) on the eigenvalues of the covariance matrices is commonly assumed in the high-dimensional setting. It implies that the majority of the variables are not highly correlated with the others in either the row direction or the column direction. Condition (C2) is mild and is assumed to ensure that defined in (4), as an estimate of the inverse of the nuisance covariance , is reasonably accurate. As such, the oracle estimate will be close to the estimate in (5), as shown in the proofs. In the special case when is bounded, (C2) essentially implies that the nuisance dimension can be of polynomial order in .
Theorem 1.
Suppose that the regularity conditions (C1) and (C2) hold. Then under , for any ,
(21) 
as . Furthermore, under , the convergence in (21) is uniform over all satisfying (C1) and (C2).
We next turn to the power analysis of the test . To perform the power analysis, we focus on sparse alternative hypotheses, as explained in Section 2.1, and define the following class of covariance matrices associated with the row direction of the matrix-valued observations:
(22) 
where was defined previously in (6). Note that this class of covariance matrices only requires one element to be large enough, . As , it essentially requires only one off-diagonal entry of to be larger than . For such matrices as the alternative hypothesis, Theorem 2 shows that can asymptotically distinguish the alternative hypothesis from the null hypothesis, under which the off-diagonal entries of are all zero. In other words, is rejected by with probability tending to 1 if .
Theorem 2.
Suppose that Conditions (C1) and (C2) hold. As , we have
Theorem 3 further demonstrates that the lower bound of in the definition of the class of covariance matrices is rate optimal. Let be the set of level tests, i.e., we have under the null hypothesis for any test .
Theorem 3.
Suppose that . Let and . There exists some constant such that for all sufficiently large and ,
The above theorem implies that, when is small enough, with probability going to one, any level test cannot reject the null hypothesis uniformly over . As a consequence, the rate as the lower bound of cannot be improved.
To sum up, Theorems 1-3 show that the test defined in Section 2 has asymptotic level , that it asymptotically has power one under certain sparse alternative hypotheses, and that the rate requirement on the sparse alternatives is the weakest possible.
To study the theoretical property of the support recovery procedure in (13), recall the definition of the support of in (12) and define the following class of covariance matrices in parallel with (22):
Note that requires the maximum of to be lower bounded by , while requires the minimum of over the support to be lower bounded by the same quantity. This requirement essentially means that all of the entries over the support are sufficiently large and thus can be distinguished from the noise. Theorem 4 below then shows that the estimator with threshold constant recovers the support perfectly with probability going to when the magnitudes of all the nonzero off-diagonal entries are above certain thresholds, as in .
Theorem 4.
Suppose that Conditions (C1) and (C2) hold. As , we have
Remark 1.
With the same reasoning as in Cai et al. (2013a), it can easily be verified that the choice of the threshold constant is optimal. In fact, for any , the probability of exact recovery of the support goes to zero. The failure of exact recovery occurs because the smaller threshold will estimate some of the zero entries by nonzero values, i.e., the estimated support will be larger than the true support. In addition, the rate of required of the nonzero entries of cannot be relaxed.
4.2 Theoretical Properties for Testing of Correlation Matrix Equality
For the two-sample testing of correlations, we assume the sample sizes of the two groups are comparable, , and write in this section.
Conditions (C1) and (C2) in the one-sample case need to be replaced by the following conditions for the two-sample case.

Assume that , , and there exist some constants such that , , and , for .

Assume that = , for .

There exists some such that for any sufficiently small constant , where the set is defined as
Note that the first two conditions are the two-sample analogs of the one-sample Conditions (C1) and (C2). The last condition ensures that most of the variables are not highly correlated with each other.
Under appropriate regularity conditions, Theorems 5-8 are the two-sample counterparts of the one-sample Theorems 1-4. In particular, Theorem 5 gives the limiting distribution of in (18) under the null hypothesis and proves that the test in (19) has level asymptotically, Theorem 6 provides the power analysis of the test, Theorem 7 demonstrates its optimality, and Theorem 8 states the exact support recovery property of .
Theorem 5.
To analyze the power of , in parallel with (22), define the following class of matrices:
where . We have the following result.
Theorem 6.
Suppose that Conditions (C1)-(C3) hold. As , we have
Note that is able to distinguish the alternative from the null so long as one entry satisfies the requirement .
The above rate is optimal because of the next theorem. Let be the set of all level tests, i.e., under for any .
Theorem 7.
Suppose that . Let and . There exists some constant such that for all large and ,
Construct the set of matrices whose support has the rate defined above, namely,
Theorem 8 claims that