Controlling the false discovery rate (FDR, ) has become routine practice in multiple hypothesis testing. Recently, weighted FDR procedures such as those of [8, 9] have demonstrated excellent performance owing to their ability to better adapt to the proportion of signals or to incorporate potential structure among the hypotheses. The “adaptive one-way GBH (GBH)” procedure of  likely represents the latest advance in designing data-adaptive weights that ensure the non-asymptotic conservativeness of the resulting procedure for grouped, weighted hypothesis testing, and it reduces to Storey’s procedure of  when there is only one group. Even though these procedures have been shown to be conservative under independence, gauging their FDRs non-asymptotically under dependence is quite challenging. There is some numerical evidence for the non-asymptotic conservativeness of Storey’s procedure and the GBH when they are applied to testing the means of equally correlated normal random variables; see, e.g., [5, 6, 8]. However, a theoretical investigation of this question does not seem to exist in the literature. In this note, we provide an analytic, non-asymptotic FDR upper bound for the GBH in the aforementioned multiple testing scenario. The bound is not tight, but it quantifies the maximal FDR of the GBH in this setting. As by-products, Lemma 3 extends Lemma 3.2 of , and Lemma 4 extends Lemma 1 of , both to the setting where p-values are not necessarily super-uniform.
We begin with the testing problem. Let be i.i.d. standard normal, where is defined to be the set for each natural number . For a constant , let for . Then ’s are exchangeable and equally correlated with correlation . We simultaneously test hypotheses versus for . This scenario has been commonly used as a “standard model” to assess the conservativeness of an FDR procedure under dependence by, e.g., [3, 6, 7, 8]. For each , consider its associated p-value , where
is the CDF of the standard normal distribution. The GBH, to be applied to , is stated as follows:
Group hypotheses: let the non-empty sets be a partition of , and accordingly let be partitioned into for .
Construct data-adaptive weights: fix a tuning parameter , and for each and , set
where and with being the indicator function of a set , and is the cardinality of .
Weight p-values and reject hypotheses: weight the p-values , into , and apply the BH procedure to at nominal FDR level .
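The three steps above can be sketched in code. The sketch below is illustrative only: it uses a Storey-type within-group null-proportion estimate and the simple weight $\hat\pi_g/(1-\hat\pi_g)$, which is one plausible form of a data-adaptive group weight; the paper's exact weight formula (step 2) may differ, and the function names are hypothetical.

```python
import numpy as np

def storey_pi0(p, lam=0.5):
    """Storey-type estimate of the null proportion with tuning parameter lam."""
    return min(1.0, (1.0 + np.sum(p > lam)) / (len(p) * (1.0 - lam)))

def bh_reject(p, alpha):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        reject[order[: below[-1] + 1]] = True
    return reject

def gbh(p, groups, alpha, lam=0.5):
    """Grouped weighted BH sketch: weight p-values group-wise, then run BH."""
    p = np.asarray(p, dtype=float)
    groups = np.asarray(groups)
    # A group estimated to be all null gets weighted p-values of 1 (never rejected).
    weighted = np.ones_like(p)
    for g in np.unique(groups):
        idx = groups == g
        pi_g = storey_pi0(p[idx], lam)
        if pi_g < 1.0:
            w_g = pi_g / (1.0 - pi_g)  # hypothetical weight; the paper's formula may differ
            weighted[idx] = np.minimum(w_g * p[idx], 1.0)
    return bh_reject(weighted, alpha)
```

Note how a group with a small estimated null proportion receives a small weight, so its p-values are shrunk and its hypotheses are easier to reject, which is the adaptivity the introduction refers to.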
Here is our main result:
When and , the FDR of GBH is upper bounded by
In the theorem we restrict mainly because researchers often choose or in practice (see  and ). Also, the requirement on is to ensure that certain integrals in the proof of Theorem 1 are finite, and the interval is obtained by solving and , which result from the calculations for the integrals in (6) when . The ratio for is partially visualized in Figure 1.
From Figure 1, we see that is increasing in but decreasing in . Further, the ratio is always less than when , and is less than when , making the upper bound useful for a good range of when . On the other hand, is achieved when and . However, the case of corresponds to independence among the normal random variables and hence among the p-values, for which the FDR of GBH is upper bounded by . So, is not tight. This is mainly because we used the suprema of several quantities related to ; see Lemma 2 and Lemma 3.
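As a hedged illustration of the kind of numerical evidence mentioned in the introduction, the following Monte Carlo sketch estimates the FDR of Storey's procedure (the one-group special case of the GBH) under the equally correlated normal model. The effect size, the one-sided p-value convention, and all parameter values here are assumptions chosen for illustration only, not the paper's configuration.

```python
import math
import numpy as np

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def storey_bh(p, alpha, lam=0.5):
    """Storey's adaptive BH procedure (one-group GBH); boolean rejection mask."""
    m = len(p)
    pi0 = min(1.0, (1.0 + np.sum(p > lam)) / (m * (1.0 - lam)))
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / (m * pi0)
    below = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        reject[order[: below[-1] + 1]] = True
    return reject

rng = np.random.default_rng(0)
m, m0, rho, alpha = 100, 80, 0.5, 0.1  # hypothetical configuration
mu = np.zeros(m)
mu[m0:] = -3.0                         # hypothetical effect size for the signals

fdp = []
for _ in range(2000):
    z0 = rng.standard_normal()
    z = rng.standard_normal(m)
    # Equally correlated normals with correlation rho via a common factor z0.
    xi = math.sqrt(rho) * z0 + math.sqrt(1.0 - rho) * z + mu
    p = phi(xi)                        # one-sided p-values; uniform under the null
    rej = storey_bh(p, alpha)
    v = rej[:m0].sum()                 # false rejections among the m0 true nulls
    fdp.append(v / max(rej.sum(), 1))

print("estimated FDR:", round(float(np.mean(fdp)), 3))
```

In runs of this kind the averaged false discovery proportion tends to stay near or below the nominal level, consistent with the numerical evidence cited above, though such simulations do not substitute for the non-asymptotic bound of Theorem 1.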
2 Proof of Theorem 1
2.1 A streamlined proof
Let and be, respectively, the numbers of false rejections and total rejections of the GBH. We first consider the conditional expectation . Since is set when , we can assume throughout the article. Let be the index set of true null hypotheses among the hypotheses, the index set of true null hypotheses in group , and the cardinality of . Further, let
be the vector of the p-values, and the vector obtained by excluding from . Then
where for each and
with and , and the inequality is due to the fact that is non-decreasing in for all , and .
Define , , , and for each and . Set . Then the inequality (1) implies
where the last equality follows from . Let be the FDR of the GBH procedure. Then with (4) we obtain
where is given in the statement of Theorem 1.
2.2 An upper bound related to the probability of a conditional false rejection
, we have the “probability of a conditional false rejection” as
which induces the ratio
Note that is set since holds, and that . The key result in this subsection is an upper bound on (or introduced later), given by Lemma 2.
First, let us verify that is upper bounded on . Setting and gives an equivalent representation of as . Clearly, when , and when . However, is continuous for . So, attains its maximum at some and is thus bounded on .
Second, let us find an upper bound for . Setting with and gives another equivalent representation of as . So, it suffices to upper bound on . Clearly, when , when , , and . So , and it suffices to upper bound on . To this end, we need the following:
With Lemma 1, we can obtain an upper bound for (or ) as follows:
For we have , where
We divide the argument into two cases. Case (1): (i.e., ). Regardless of the values of and , we have for . On the other hand,
for each , and . So, , and when .