# A Family-based Graphical Approach for Testing Hierarchically Ordered Families of Hypotheses

In clinical trial applications, the hypotheses to be tested are often grouped into multiple hierarchically ordered families. To test such structured hypotheses, various gatekeeping strategies have been developed in the literature, such as serial gatekeeping, parallel gatekeeping, and tree-structured gatekeeping strategies. However, these gatekeeping strategies are often either non-intuitive or inflexible when addressing increasingly complex logical relationships among families of hypotheses. To overcome this issue, we develop a new family-based graphical approach, which can easily derive and visualize different gatekeeping strategies. In the proposed approach, a directed and weighted graph represents the generated gatekeeping strategy: each node corresponds to a family of hypotheses, and two simple updating rules are used to update the critical value of each family and the transition coefficient between any two families. Theoretically, we show that the proposed graphical approach strongly controls the overall familywise error rate at a pre-specified level. Through some case studies and a real clinical example, we demonstrate the simplicity and flexibility of the proposed approach.


## 1 Introduction

In clinical trial research, it is becoming increasingly common to consider the problems of complex multiple testing due to hierarchically ordered multiple objectives. In these problems, the hypotheses to be tested are usually grouped into multiple families, and these families are tested in a sequential manner. For example, there are usually multiple endpoints of interest in clinical trials and these endpoints are generally classified as primary, secondary and sometimes tertiary endpoints which form a natural hierarchical structure. To deal with such structured multiple testing problems, Maurer, Hothorn and Lehmacher (1995) and Bauer et al. (1998) introduced a convenient and efficient way called gatekeeping strategy based on which hypotheses in one family cannot be tested if the testing results of the previous families do not meet some pre-specified gatekeeping conditions. Basically, there are two types of gatekeeping strategies. One is serial gatekeeping (Westfall and Krishen, 2001) in which each family can be tested using any FWER controlling procedure if and only if all hypotheses in the previous families are rejected. The other is parallel gatekeeping (Dmitrienko, Offen and Westfall, 2003) in which the subsequent family can be tested if and only if at least one hypothesis in current family is rejected.

The tree-structured gatekeeping strategy introduced by Dmitrienko, Wiens and Tamhane (2007) and its extension, the mixture procedure, introduced by Dmitrienko and Tamhane (2011, 2013) were also developed for testing hierarchically ordered families of hypotheses with complex logical relationships. However, both the tree-structured gatekeeping strategy and the mixture procedure were derived from the closure principle of Marcus et al. (1976), so implementing them requires intensive computation. To avoid the computational burden caused by the closure principle, Dmitrienko, Tamhane, Wang and Chen (2006), Guilbaud (2007) and Dmitrienko, Tamhane and Wiens (2008) developed a simple stepwise approach for implementing gatekeeping strategies. Dmitrienko, Tamhane and Wiens (2008) introduced a general multistage gatekeeping procedure, which unified the above works. Owing to the stepwise shortcut, the multistage gatekeeping procedure is more straightforward and easier to explain to clinicians in practice. However, when dealing with complex logical restrictions, the multistage gatekeeping procedure is less flexible than the mixture procedure, although the latter is computationally intensive.

With the increasing complexity of the hierarchical logical restrictions of gatekeeping strategies, proper visualization and presentation of such strategies is very helpful for users. To develop such a visualization tool, one solution is to employ the idea of the graphical approaches proposed by Bretz et al. (2009) and Burman et al. (2009). Graphical approaches have been used for sequentially testing hierarchically structured hypotheses, such as the superchain procedure proposed by Kordzakhia and Dmitrienko (2013), where each family is presented as a vertex and the local significance levels are propagated via transition coefficients between families instead of hypotheses. However, this approach tests all families of hypotheses simultaneously at each step, which is not suitable in most clinical trial settings, such as those in which the families of hypotheses have a hierarchical structure. Maurer and Bretz (2014) developed a graphical approach for testing families of hypotheses which is able to visualize the serial gatekeeping procedure in the sense that the graph can be updated only if all hypotheses in a single family are rejected.

In this paper, we propose a new family-based graphical approach that visualizes the hierarchical logical restrictions of the usual gatekeeping procedures more flexibly than the existing graphical approaches. This approach can serve as an extension of the multistage gatekeeping procedure in the sense that it not only takes advantage of the stepwise algorithm but also deals with more general logical restrictions than the multistage gatekeeping procedure. For example, the proposed graphical approach can also be applied to some complex multiple testing problems where equally important families of hypotheses are grouped in the same layer, e.g., primary endpoints and co-primary endpoints.

The rest of the paper is organized as follows. We discuss our research motivation through an example and briefly introduce the idea of our family-based graphical approach in Section 2.1. We then present some basic notations and assumptions in Section 2.2. In Section 3, we introduce the general algorithm for sequentially testing families of hypotheses and show its overall FWER control. In Section 4, we show the advantages of our approach through three case studies from Bretz et al. (2009). A real data analysis is performed in Section 5. Some concluding remarks are made in Section 6, and all proofs are deferred to the Appendix.

## 2 Preliminary

In this section, we discuss our research motivation through a heuristic example and introduce some basic notations and assumptions.

### 2.1 Heuristics

Bretz et al. (2009) introduce a general graphical approach which provides a graphical tool to visualize Bonferroni-adjusted gatekeeping procedures. As an example, Figure 1 shows such graphical visualization of the parallel gatekeeping strategy based on a truncated Holm procedure that is used for testing four hypotheses grouped as two families, where each hypothesis is represented by a vertex.

Compared with the conventional multiple testing procedures for testing a single family of hypotheses, the hypothesis-based graphical approach is indeed explicit and efficient. However, in practice, increasingly complex clinical trial problems often involve testing multiple ordered families of hypotheses, which makes such hypothesis-based graphs more complicated and even inapplicable in some settings with a large number of families.

Consider an example that 9 hypotheses are grouped into 3 families where each family consists of three hypotheses, denoted as , for . Suppose that and are sequentially tested by a truncated Holm procedure and is tested by the conventional Holm procedure. The subsequent family of hypotheses can be tested if and only if at least one hypothesis in the current family is rejected. Figure 2 illustrates the hypothesis-based graphical visualization of the parallel gatekeeping strategy. Due to its complexity, the weights on the edges are omitted in this graph. As seen from Figure 2, the hypothesis-based graph is relatively unclear and complicated, although it only involves testing 3 families of hypotheses.

While testing multiple families of hypotheses, hierarchically logical restrictions among the families are often one important aspect. Thus, it is natural for us to focus more on the logical relationships at family level rather than at hypothesis level, to develop a graphical approach for visualizing conventional gatekeeping strategies for testing multiple ordered families of hypotheses. By using the similar idea as in Kordzakhia and Dmitrienko (2013), we use a vertex to represent a family of hypotheses instead of an individual hypothesis and a directed edge with a pre-specified weight associated with it to represent the transition relationship between two families. We term this approach as family-based graphical approach. For the example illustrated in Figure 2, an equivalent family-based graph is shown in Figure 3 (a), where the families are represented by vertices. As seen from Figure 3 (a), we start testing at level ; the subsequent family can be tested if and only if at least one rejection is made while testing the current family . The allocation of the critical values among families is via transition coefficients on the edges, that is, after rejections are made in one family, the critical value of this family is proportionally transferred to the subsequent families based on the transition coefficients on the edges from the family to the subsequent families. For more details of the updating rule, see Section 3.

To make the example in Figure 3 (a) more interesting, consider a specific parallel gatekeeping strategy for which the initial critical values of and are respectively and and except for transferring to , of the critical value of can be passed down to if at least one hypothesis is rejected in . Figure 3 (b) illustrates the family-based graph of this parallel gatekeeping strategy. As seen from Figure 3 (b), even when there are no rejections in , the subsequent and can still be tested at their local critical values.

### 2.2 Basic notations

In this subsection, we present some basic notations and definitions. Suppose there are hypotheses divided into families, which are further grouped into $n$ layers, with $L_i$ being the $i$th ordered layer consisting of $l_i$ families of hypotheses, $i = 1, \ldots, n$. Each family $F_{ij}$ within layer $L_i$ has null hypotheses, denoted as $H_{ijk}$, for such that . These families of hypotheses are to be tested based on their respective $p$-values $P_{ijk}$, subject to controlling an overall measure of type I error at a pre-specified level $\alpha$. Each of the true null $p$-values is assumed to be stochastically greater than or equal to the uniform distribution on $(0, 1)$; that is, if $T_{ij}$ is the set of true null hypotheses in $F_{ij}$, then for any fixed $u \in (0, 1)$,

$$\Pr\{P_{ijk} \le u \mid H_{ijk} \in T_{ij}\} \le u, \tag{1}$$

for any $i$, $j$, and $k$.

The familywise error rate (FWER), which is the probability of incorrectly rejecting at least one true null hypothesis, is a commonly used notion of an overall measure of type I error when testing a single family of hypotheses. Since we have multiple layers with any number of families within each layer, we consider this measure not locally for each family but globally. In other words, we define the overall FWER as the probability of incorrectly rejecting at least one true null hypothesis across all families of hypotheses for all layers. If it is bounded above by $\alpha$ regardless of which and how many null hypotheses within each family are true for any layer, then this overall FWER is said to be strongly controlled at level $\alpha$.

In this paper, we propose a general procedure, called the family-based graphical approach, that strongly controls the overall FWER at level $\alpha$. Given the pre-specified critical value $\alpha$, let denote the initial critical values assigned to layer with . Moreover, let denote the initial critical values assigned to families within layer with . The procedure tests layers $L_1$ to $L_n$ sequentially, and within each layer $L_i$, families are tested in any order using any local procedures based on their own (local) critical values. The critical value used to locally test each family within the current layer is updated from its initially assigned value to one that incorporates certain portions of the critical values used in testing the families within the previous layers. The procedure stops when all families of the last layer are tested. The specific updating rule for local critical values is described in Section 3. The distribution of the amount of critical values transferred among families can be pre-fixed by a transition coefficient set, which is defined as follows.

Let $\{g_{ijkl}\}$ denote the set of all transition coefficients, which satisfy the following conditions for any $i$ and $j$:

$$\sum_{k=i+1}^{n} \sum_{l=1}^{l_k} g_{ijkl} \le 1; \qquad 0 \le g_{ijkl} \le 1; \qquad g_{ijkl} = 0 \ \text{ if } i \ge k.$$

Note that $g_{ijkl}$ is defined as the proportion of the local critical value that can be transferred from family $F_{ij}$ within layer $L_i$ to family $F_{kl}$ within layer $L_k$. Figure 4 shows the graphical representation of the general family-based approach.

Based on the initial critical values and the transition coefficients , we can construct a directed acyclic graph for the aforementioned family-based approach. In this graph, each family is represented by a vertex associated with its initial critical value ; for any two vertices corresponding to two respective families and , if the transition coefficient from to is positive, then a directed edge between these two vertices is displayed, where and are head and tail vertices, respectively. Since each vertex is associated with a family instead of a hypothesis, we term the graph as a family-based graph, which is illustrated in Figure 4.
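The conditions on the initial critical values and the transition coefficients can be checked mechanically before a graph is used. The following Python sketch is only an illustration: the function name `check_graph`, the dictionary layout, and the example values (taken to be $\alpha/2$, $\alpha/2$ and $0$) are our own assumptions, not part of the original procedure description.

```python
from typing import Dict, Tuple

Family = Tuple[int, int]  # (layer index i, family index j), both 1-based


def check_graph(alpha: float,
                init: Dict[Family, float],
                g: Dict[Tuple[Family, Family], float],
                tol: float = 1e-12) -> None:
    """Raise ValueError if a family-based graph violates the stated conditions.

    init maps each family F_ij to its initial critical value;
    g maps (source family, target family) to a transition coefficient.
    """
    # The initial critical values must not add up to more than alpha.
    if sum(init.values()) > alpha + tol:
        raise ValueError("initial critical values exceed alpha")
    outflow: Dict[Family, float] = {}
    for (src, dst), w in g.items():
        # 0 <= g_ijkl <= 1
        if not 0.0 <= w <= 1.0:
            raise ValueError(f"coefficient {src}->{dst} outside [0, 1]")
        # g_ijkl = 0 if i >= k: positive edges must point to a later layer.
        if w > 0 and src[0] >= dst[0]:
            raise ValueError(f"edge {src}->{dst} does not point to a later layer")
        outflow[src] = outflow.get(src, 0.0) + w
    # The outgoing coefficients of each family must sum to at most 1.
    for src, total in outflow.items():
        if total > 1.0 + tol:
            raise ValueError(f"outgoing coefficients of {src} sum to {total} > 1")


# Hypothetical graph: two first-layer families feeding one second-layer family.
alpha = 0.05
init = {(1, 1): alpha / 2, (1, 2): alpha / 2, (2, 1): 0.0}
g = {((1, 1), (2, 1)): 1.0, ((1, 2), (2, 1)): 1.0}
check_graph(alpha, init, g)  # passes silently
```

Because every positive coefficient must point from an earlier layer to a later one, any graph that passes this check is automatically acyclic.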

Our specific updating rule for local critical values, described in Section 3, is based on the error rate function introduced in Dmitrienko et al. (2008), which is defined as follows.

###### Definition 1

(Dmitrienko et al., 2008) Consider a single family of hypotheses, and a multiple testing procedure for testing the family . The error rate function of this procedure is defined as

$$e(I) = \sup_{H_I} \Pr\Bigl\{\bigcup_{i \in I} \{\text{reject } H_i\} \Bigm| H_I\Bigr\}$$

for any index set $I$, where $H_I$ is the intersection of the hypotheses $H_i$ with $i \in I$.

Note that in applications, if the error rate function cannot be calculated easily, we often use one of its upper bounds to replace it.

In the family-based approach, each family is tested by its own local procedure, thus it is associated with a particular error rate function. Let denote the local critical value for testing family and denote the set of accepted hypotheses in . Based on , we can calculate after testing at level and then transfer the remaining amount of its local critical value to the respective families in the subsequent layers according to the corresponding transition coefficients.
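To make the "remaining amount" concrete, the sketch below computes it for one family. It assumes the upper bound $e(I) = [\gamma + (1-\gamma)|I|/n]\,\alpha$ for nonempty $I$, the form usually associated with truncated Holm-type procedures with truncation parameter $\gamma$. This formula is not stated in the present section, so it should be read as an assumption to be checked against Dmitrienko et al. (2008); the function names are hypothetical.

```python
def error_rate_bound(num_accepted: int, family_size: int,
                     level: float, gamma: float) -> float:
    """Assumed upper bound on the error rate function e(A) of a truncated
    Holm-type procedure: 0 if nothing is accepted, otherwise
    (gamma + (1 - gamma) * |A| / n) * level."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if num_accepted == 0:
        return 0.0
    return (gamma + (1.0 - gamma) * num_accepted / family_size) * level


def transferable_amount(num_accepted: int, family_size: int,
                        level: float, gamma: float) -> float:
    """Part of the local critical value left over for later layers."""
    return level - error_rate_bound(num_accepted, family_size, level, gamma)


# One of three hypotheses accepted, local level 0.025, gamma = 0.5.
left_over = transferable_amount(1, 3, 0.025, 0.5)
# gamma = 1 (conventional Holm): any acceptance uses up the whole level.
holm_left_over = transferable_amount(1, 3, 0.025, 1.0)
```

Under this assumed bound, $\gamma = 1$ leaves nothing to transfer unless every hypothesis in the family is rejected, while $\gamma < 1$ always leaves a positive amount as long as at least one hypothesis is rejected.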

###### Remark 1

The error rate function introduced in Dmitrienko et al. (2008) was used to develop a simple stepwise approach for parallel gatekeeping strategies. In their discussion, the error rate function is required to be strictly less than unless all of the hypotheses in one family are rejected, which is termed the separability condition. In this paper, the definition of the error rate function we use is slightly more general. For this function, the separability condition is not required when choosing local procedures for our suggested family-based graphical approach.

## 3 Methodology

In this section, we introduce a new family-based graphical approach and show its overall FWER control. We begin in Subsection 3.1 with a simple case of two layers with two families of hypotheses within each layer. The general case of multiple layers with arbitrary number of families within each layer is discussed in Subsection 3.2.

### 3.1 Two-layer family-based graphical approach with four families

Consider four families of hypotheses divided into two layers based on their hierarchical relationships, with two families of hypotheses within each layer.

By using the notations introduced in Section 2.2, we define a two-layer family-based graphical approach through the following algorithm:

###### Algorithm 1

Step 1. Set . Test family using any FWER controlling procedure at critical value , and calculate .
Update the graph:

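$$L_1 \to L_1 \setminus \{F_{1j}\}; \quad \text{for } k = 1, 2, \text{ let } \alpha_{2k} \to \alpha_{2k} + \bigl(\alpha_{1j} - e^*_{1j}(A_{1j})\bigr)\, g_{1j2k}; \quad g_{1l2k} \to \begin{cases} g_{1l2k}, & l \ne j, \\ 0, & \text{otherwise.} \end{cases}$$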

If , go back to step 1; otherwise, go to next step.
Step 2. Test , , using any FWER controlling procedure at level and update the graph:

$$L_2 \to L_2 \setminus \{F_{2k}\}.$$

If , go back to step 2; otherwise stop.

Algorithm 1 starts the test from the families in . Once is tested, the critical value of is updated based on the error rate function and the transition coefficient set ; moreover, itself is updated by deleting all the elements associated with . This procedure can be fully described by a graph displayed in Figure 5. For Algorithm 1, we have the following theorem.

###### Theorem 1

Under the conditions of the corresponding local procedures controlling the FWER within each family of hypotheses, the two-layer multiple testing procedure described in Algorithm 1 strongly controls the overall FWER at level .

For the proof of Theorem 1, see Appendix A.1.
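As a purely numerical illustration of the update in Step 1, consider the following worked example. All values are hypothetical: we assume $\alpha = 0.05$, initial critical values $\alpha_{11} = \alpha_{12} = 0.025$ and $\alpha_{21} = \alpha_{22} = 0$, and transition coefficients $g_{1j2k} = 1/2$ for all $j, k$. We further suppose that every hypothesis in $F_{11}$ is rejected, so that $e^*_{11}(A_{11}) = 0$, while $F_{12}$ is tested by a local procedure whose error rate bound equals its full level once something is accepted, so that $e^*_{12}(A_{12}) = 0.025$.

```latex
\begin{align*}
\text{after testing } F_{11}:\quad
  \alpha_{2k} &\to 0 + (0.025 - 0)\cdot\tfrac{1}{2} = 0.0125,
  && k = 1, 2, \\
\text{after testing } F_{12}:\quad
  \alpha_{2k} &\to 0.0125 + (0.025 - 0.025)\cdot\tfrac{1}{2} = 0.0125,
  && k = 1, 2.
\end{align*}
```

In this example both second-layer families are tested at level $0.0125$, and the level carried into the second layer ($0.025$ in total) never exceeds what the first layer left unused.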

### 3.2 General multi-layer family-based graphical approach

The aforementioned two-layer four-family case demonstrates the inherently sequential nature of testing in the family-based graphical approach. We now generalize the graphical approach from two layers with two families of hypotheses in each layer to any number of layers with an arbitrary number of families of hypotheses within each layer. The general multi-layer family-based graphical approach is defined through the following algorithm:

###### Algorithm 2

Step i . Test family using any FWER controlling procedure at level , and calculate .
Update the graph:

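$$L_i \to L_i \setminus \{F_{ij}\}; \quad \text{for } k = i+1, \ldots, n,\ l = 1, \ldots, l_k, \text{ let } \alpha_{kl} \to \alpha_{kl} + \bigl(\alpha_{ij} - e^*_{ij}(A_{ij})\bigr)\, g_{ijkl}; \quad g_{iskl} \to \begin{cases} g_{iskl}, & s \ne j, \\ 0, & \text{otherwise.} \end{cases}$$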

If , go back to step i; otherwise, go to next step.
Step n. Test . Use any FWER controlling procedure at level to test and update . If , go back to step ; otherwise stop.

For this general multi-layer family-based graphical approach, we have the following theorem.

###### Theorem 2

Under the conditions of the corresponding local procedures controlling the FWER within each family of hypotheses, the general multi-layer family-based graphical approach strongly controls the overall FWER at level .

For the proof of Theorem 2, see Appendix A.2.
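The steps of Algorithm 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the local procedure is a truncated Holm procedure with the critical values $[\gamma/(n-i+1) + (1-\gamma)/n]\,\alpha$ and the error rate bound $[\gamma + (1-\gamma)|A|/n]\,\alpha$, both of which we assume rather than take from this section; the names `truncated_holm` and `family_graph_test`, the three-family graph, and all $p$-values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Family = Tuple[int, int]  # (layer i, family j)
# A local procedure maps (p-values, level) to (rejection flags, bound on e*(A)).
LocalProc = Callable[[List[float], float], Tuple[List[bool], float]]


def truncated_holm(gamma: float) -> LocalProc:
    """Step-down procedure with assumed truncated Holm critical values."""
    def run(pvals: List[float], level: float) -> Tuple[List[bool], float]:
        n = len(pvals)
        order = sorted(range(n), key=lambda i: pvals[i])
        reject = [False] * n
        for step, idx in enumerate(order):  # step = 0, 1, ..., n - 1
            crit = (gamma / (n - step) + (1.0 - gamma) / n) * level
            if pvals[idx] > crit:
                break
            reject[idx] = True
        accepted = n - sum(reject)
        bound = 0.0 if accepted == 0 else (gamma + (1.0 - gamma) * accepted / n) * level
        return reject, bound
    return run


def family_graph_test(pvals: Dict[Family, List[float]],
                      init: Dict[Family, float],
                      g: Dict[Tuple[Family, Family], float],
                      procs: Dict[Family, LocalProc]):
    """Test layers in order; after each family, pass (level - bound) along its edges."""
    level = dict(init)
    decisions: Dict[Family, List[bool]] = {}
    for fam in sorted(pvals):  # sorts by layer first, then by family index
        reject, bound = procs[fam](pvals[fam], level[fam])
        decisions[fam] = reject
        carry = level[fam] - bound
        for (src, dst), w in g.items():
            if src == fam:  # each family is tested once, so each edge is used once
                level[dst] += carry * w
    return decisions, level


# Hypothetical three-layer example with one family per layer.
alpha = 0.05
init = {(1, 1): alpha, (2, 1): 0.0, (3, 1): 0.0}
g = {((1, 1), (2, 1)): 1.0, ((2, 1), (3, 1)): 1.0}
procs = {(1, 1): truncated_holm(0.5), (2, 1): truncated_holm(0.5),
         (3, 1): truncated_holm(1.0)}  # gamma = 1 is the conventional Holm procedure
pvals = {(1, 1): [0.01, 0.02, 0.30], (2, 1): [0.004, 0.005, 0.006],
         (3, 1): [0.001, 0.02]}
decisions, level = family_graph_test(pvals, init, g, procs)
```

The sketch does not zero out the coefficients of a tested family explicitly; since each family is visited exactly once and edges only point to later layers, using each outgoing edge once has the same effect.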

###### Remark 2

Consider a specific problem of testing hierarchically ordered families of hypotheses, where there are layers, and for each layer , there is only one family . To deal with this multiple testing problem, consider a multi-layer family-based graphical approach, whose initial critical value for is if and otherwise; whose transition coefficients are given by , if and 0 otherwise. Regarding this graphical approach, we have the following several remarks.

1. If each family is tested using a local procedure controlling the FWER and satisfying the separability condition, i.e., the error rate function of the local procedure is strictly smaller than when at least one hypothesis is not rejected within the family, then the multi-layer family-based graphical approach reduces to a specific parallel gatekeeping strategy, which is in turn equivalent to a general multistage gatekeeping procedure introduced by Dmitrienko et al. (2008). Examples of such local procedures include the conventional Bonferroni procedure, the truncated Holm procedure, and the truncated fallback procedure; see Dmitrienko et al. (2008).

2. If each family is tested using a FWER controlling local procedure for which the upper bound of its error rate function is given by for any , then the corresponding multi-layer graphical approach is equivalent to a specific serial gatekeeping strategy. Examples of such local procedures include the conventional Holm procedure and the fixed sequence procedure.

3. If each family has only one null hypothesis, then the multi-layer graphical approach reduces to the conventional fixed sequence procedure.

4. If some correlation information regarding the null $p$-values within one family is known in advance, then there are more options for local procedures. For example, if the null $p$-values in a family are known to be positively dependent or independent, then we can use the conventional or truncated Hochberg procedure as its local procedure.
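The reduction in item 3 can be traced in one line of algebra. The derivation below is a sketch under our own simplifying notation: each family $F_i$ contains a single hypothesis $H_i$ with $p$-value $p_i$ and current critical value $\alpha_i^*$, the first family starts at $\alpha$, all later families start at $0$, the only positive transition coefficient out of $F_i$ is $g_{i,i+1} = 1$, and the error rate function of a one-hypothesis family is bounded by its full level whenever the hypothesis is accepted.

```latex
\begin{align*}
\text{reject } H_i &\iff p_i \le \alpha_i^{*}, \\
e_i^{*}(A_i) &=
  \begin{cases}
    0,            & A_i = \emptyset \ (H_i \text{ rejected}), \\
    \alpha_i^{*}, & A_i = \{H_i\} \ (H_i \text{ accepted}),
  \end{cases} \\
\alpha_{i+1}^{*} &= 0 + \bigl(\alpha_i^{*} - e_i^{*}(A_i)\bigr)\cdot 1 =
  \begin{cases}
    \alpha_i^{*}, & H_i \text{ rejected}, \\
    0,            & H_i \text{ accepted}.
  \end{cases}
\end{align*}
```

Under these assumptions the full level $\alpha$ is handed from $H_i$ to $H_{i+1}$ until the first acceptance, after which every later hypothesis is tested at level $0$; this is the fixed sequence procedure.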

## 4 Discussions

In this section, we use three cases from Bretz et al. (2009) to illustrate the efficiency and simplicity of our proposed family-based graphical approach as compared to the conventional hypothesis-based graphical approach in dealing with the problem of testing multiple families of hypotheses. These cases are visualized in Figures 6-8, respectively, in which the original hypothesis-based graphs of Bretz et al. (2009) are displayed on the left side, and their corresponding family-based graphs are displayed on the right side.

###### Case 1

Consider a case in Figure 6 with four null hypotheses and . The left side of Figure 6 displays the hypothesis-based graphical procedure and its right side displays an equivalent family-based graphical procedure, where these four null hypotheses are grouped as families, and , and layers, and . The initial critical values allocated to the three families are respectively and , and the transition coefficient set is given by

$$g_{1121} = g_{1221} = 1; \qquad g_{2111} = g_{2112} = g_{1112} = g_{1211} = 0.$$

The family-based procedure starts with testing (or ) using the Bonferroni method at level . If is rejected, the critical value of is transferred to as indicated by the transition coefficient on the directed edge from to , such that the critical value of is updated to . If is not rejected, no critical value is transferred to . Then, the procedure continues testing using the Bonferroni method at level . Once is rejected, its critical value will be added to . Otherwise, no critical value is transferred to . After testing both and in , if , we continue testing in using the Holm procedure at level . Through the whole testing process, we can see that our family-based graphical procedure is equivalent to the hypothesis-based graphical procedure displayed in Figure 6 (left). It is easy to observe from Figure 6 (right) that family-based graphical visualization describes the hierarchical relationship among the families of hypotheses more simply and clearly, as compared to hypothesis-based graphical visualization.

There are often some situations where the hypotheses in one family can be tested only if all the hypotheses in another family are rejected. If one uses the original hypothesis-based graphical approach to deal with such multiple testing problems, the generated graphs often include the edges with infinitesimally small weights, which are complex and difficult to communicate to non-statisticians. However, it is shown in the following that the infinitesimally small weights can be removed in the graphs by using our suggested family-based graphical approach.

###### Case 2

Consider a gatekeeping strategy involving testing three hypotheses and . Suppose only if both and are rejected, has the chance to be tested. The hypothesis-based graph of this gatekeeping strategy is shown in Figure 7 (left) with an edge associated with an infinitesimally small weight . When using the family-based graphical approach, the generated family-based graph is shown in Figure 7 (right), where the edge with the infinitesimally small weight is removed. As seen from Figure 7 (right), this method turns out to be a simple two-layer, two-family procedure with and , where and ; the initial critical values for and are and , respectively. Thus, the specific gatekeeping strategy can be described as follows: start testing using the conventional Holm procedure at level . If both hypotheses in are rejected, then its critical value is passed on to such that is tested at level . Otherwise, the test stops.

###### Case 3

Consider a more complicated gatekeeping strategy involving testing four hypotheses and . Suppose that and are of interest only if both and are rejected. The hypothesis-based graph of this gatekeeping strategy is shown in Figure 8 (left) with the edges associated with infinitesimally small weights. As seen from Figure 8 (left), if both hypotheses and are rejected, the critical value is proportionally assigned to and according to the weights and such that receives and receives . When using the family-based graphical approach, the generated family-based graph is shown in Figure 8 (right). As seen from Figure 8 (right), this method turns out to be a simple two-layer, two-family procedure with and where and . The initial critical values for and are and , respectively. Thus, the specific procedure can be described as follows: perform the conventional Holm procedure for testing at level . If both and are rejected, its critical value is passed on to and, unlike Case 2, we then perform a weighted Holm procedure with weights and for testing at . Otherwise, the test stops.
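The two-stage logic of Case 3 can be sketched as follows. The weighted Holm procedure is implemented in its standard step-down form (at each step, the remaining hypothesis with the smallest $p_i / w_i$ is compared with $w_i \alpha / \sum_{k} w_k$ over the remaining hypotheses). The weights, the $p$-values, the level $\alpha = 0.05$, and the function names are hypothetical, and we assume that the first family starts at the full level while the second starts at zero.

```python
from typing import List, Tuple


def weighted_holm(pvals: List[float], weights: List[float],
                  level: float) -> List[bool]:
    """Weighted Holm step-down procedure; weights must be positive."""
    remaining = set(range(len(pvals)))
    reject = [False] * len(pvals)
    while remaining:
        total = sum(weights[i] for i in remaining)
        # Candidate: the remaining hypothesis with the smallest p / w ratio.
        i_min = min(remaining, key=lambda i: pvals[i] / weights[i])
        if pvals[i_min] > weights[i_min] / total * level:
            break
        reject[i_min] = True
        remaining.remove(i_min)
    return reject


def case3(p_first: List[float], p_second: List[float],
          weights: List[float], alpha: float) -> Tuple[List[bool], List[bool]]:
    """Holm on the first family; weighted Holm on the second family
    only if every hypothesis in the first family is rejected."""
    # Equal weights turn weighted Holm into the conventional Holm procedure.
    first = weighted_holm(p_first, [1.0] * len(p_first), alpha)
    if all(first):
        second = weighted_holm(p_second, weights, alpha)
    else:
        second = [False] * len(p_second)  # gate stays closed
    return first, second


# Hypothetical p-values and weights.
open_gate = case3([0.01, 0.02], [0.03, 0.04], [0.7, 0.3], 0.05)
closed_gate = case3([0.01, 0.06], [0.03, 0.04], [0.7, 0.3], 0.05)
```

No infinitesimal weights appear anywhere in this formulation: the "all rejected" gate is expressed through the conventional Holm procedure passing on its level only when nothing is accepted.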

###### Remark 3

Through the discussion of the above three cases, it is easy to see that when dealing with complex problems of testing multiple families of hypotheses, our proposed family-based graphical approach usually makes the whole testing process clearer and easier to communicate to non-statisticians than the conventional hypothesis-based graphical approach, which often involves non-intuitive infinitesimally small weights.

## 5 A Clinical Trial Example

In this section, we consider a clinical trial example to illustrate the application of our proposed family-based graphical approach and compare its performance with that of the conventional hypothesis-based graphical approach.

We revisit the Type II diabetes clinical trial example in Dmitrienko et al. (2007). The trial compares three doses of an experimental drug (Doses L, M and H) versus placebo (Plac) with respect to one primary endpoint (P: Haemoglobin A1c), and two secondary endpoints (S1: Fasting serum glucose; S2: HDL cholesterol). The three endpoints will be examined at each of the three doses, so a total of nine null hypotheses will be formulated and grouped into three families, and . Family consists of three dose-placebo comparisons corresponding to the primary endpoint (P): H vs Plac (), M vs Plac () and L vs Plac (). Similarly, family consists of three dose-placebo comparisons corresponding to the secondary endpoint S1: H vs Plac (), M vs Plac () and L vs Plac () and family consists of three dose-placebo comparisons corresponding to the secondary endpoint S2: H vs Plac (), M vs Plac () and L vs Plac ().

The overall Type I error rate is pre-specified at and the raw $p$-values for the nine null hypotheses are given in Table 1. In this example, we assume that the primary endpoint is more important than the secondary endpoints S1 and S2, thus is always tested before testing and . For and , we consider two types of hierarchical relationships below and thus discuss two different gatekeeping strategies, Procedures 1 and 2. We visualize these two procedures by using the family-based and hypothesis-based graphical approaches, respectively.

Procedure 1. Suppose that the secondary endpoints and are equally important, thus and are grouped into the same layer; the dose-placebo comparisons within each family are ordered a priori (H vs. Plac through L vs. Plac). We choose the conventional fixed sequence procedure as local procedure for each family and the initial allocation of critical values for and are , and , respectively. Once is tested and all of its hypotheses are rejected, its critical value is equally allocated to and . Figure 9 (a) visualizes this gatekeeping strategy. We start testing at level ; all of three hypotheses in are rejected using the conventional fixed sequence procedure. Then, all of its local critical value is equally assigned to and and the updated critical values for and become . We continue to test and at level in any order using the conventional fixed sequence procedure; the resulting rejected hypotheses are , and . Finally, the testing results of Procedure 1 are summarized in Table 1. In addition, Figure 9 (b) provides a graphical visualization for Procedure 1 by using the hypothesis-based graphical approach. As seen from Figure 9, compared to the hypothesis-based graph, the family-based graph provides more clear and intuitive illustrations of the hierarchical relationships among the families of hypotheses.
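The mechanics of Procedure 1 can be sketched as follows. Since Table 1 is not reproduced here, the $p$-values below are hypothetical placeholders, and the overall level of 0.05 and the zero initial critical values for the two secondary families are assumptions as well. The error rate bound of the fixed sequence procedure is taken to be its full level whenever some hypothesis is accepted, in line with item 2 of Remark 2.

```python
from typing import List, Tuple


def fixed_sequence(pvals: List[float], level: float) -> Tuple[List[bool], float]:
    """Fixed sequence procedure: test in the given order and stop at the first
    non-rejection. Returns the rejection flags and the error-rate bound, taken
    as `level` when anything is accepted and 0 otherwise."""
    reject = [False] * len(pvals)
    for idx, p in enumerate(pvals):
        if p > level:
            break
        reject[idx] = True
    bound = 0.0 if all(reject) else level
    return reject, bound


alpha = 0.05                        # assumed overall level
p_primary = [0.005, 0.011, 0.018]   # hypothetical, ordered H, M, L vs placebo
p_s1 = [0.009, 0.026, 0.013]        # hypothetical
p_s2 = [0.010, 0.006, 0.051]        # hypothetical

rej_primary, bound = fixed_sequence(p_primary, alpha)
share = (alpha - bound) * 0.5       # equal split between the two secondary families
rej_s1, _ = fixed_sequence(p_s1, share)
rej_s2, _ = fixed_sequence(p_s2, share)
```

With these placeholder values the primary family is fully rejected, each secondary family is tested at level 0.025, and the order in which the two secondary families are tested does not matter because neither passes anything to the other.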

Procedure 2. Suppose that the secondary endpoint is more important than , thus and are tested in a pre-defined order. Consider the gatekeeping strategy visualized in Figure 3 (b) for which the truncated Hochberg procedure with truncation parameter is used as local procedure for testing and ; the conventional Hochberg procedure is used for testing . The initial allocation of critical values for and are , and , respectively. We start testing at level ; all of three hypotheses in are rejected using the truncated Hochberg procedure; the updated critical values for and are and , respectively. We then test at level 0.037 using the same truncated Hochberg procedure; all of the three hypotheses in are rejected as well and its local critical value is transferred to ; the updated critical value of is . Finally, we test at level ; thus and are rejected. The testing results of Procedure 2 are also summarized in Table 1. We need to note that the conventional hypothesis-based graphical approach is not applicable to visualize Procedure 2.
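For readers unfamiliar with the local procedure used in Procedure 2, the following sketch implements a truncated Hochberg procedure as a step-up test. The critical values $[\gamma/(n-i+1) + (1-\gamma)/n]\,\alpha$ for the $i$th smallest $p$-value are an assumption on our part (the form commonly given for truncated procedures), the function name is hypothetical, and the $p$-values are illustrative rather than those of Table 1.

```python
from typing import List


def truncated_hochberg(pvals: List[float], level: float,
                       gamma: float) -> List[bool]:
    """Step-up procedure with assumed truncated critical values.

    gamma = 1 gives the conventional Hochberg procedure and gamma = 0
    gives the Bonferroni procedure.
    """
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    reject = [False] * n
    # Start from the largest p-value and move down until one passes.
    for rank in range(n, 0, -1):
        crit = (gamma / (n - rank + 1) + (1.0 - gamma) / n) * level
        if pvals[order[rank - 1]] <= crit:
            # Reject this hypothesis and all those with smaller p-values.
            for idx in order[:rank]:
                reject[idx] = True
            break
    return reject


# Illustrative p-values at level 0.05.
hochberg_all = truncated_hochberg([0.03, 0.04, 0.045], 0.05, 1.0)
truncated_none = truncated_hochberg([0.03, 0.04, 0.045], 0.05, 0.5)
truncated_one = truncated_hochberg([0.01, 0.04, 0.045], 0.05, 0.5)
```

The first two calls show the price of truncation: with $\gamma = 1$ all three hypotheses are rejected, whereas with $\gamma = 0.5$ none are, in exchange for leaving part of the level available to later families.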

## 6 Conclusions

In this paper, we developed a new family-based graphical approach for testing hierarchically ordered families of hypotheses. Theoretically, we proved that the proposed graphical approach strongly controls the FWER at a pre-specified level. By using the proposed approach, we can easily develop and visualize various gatekeeping strategies. Specifically, when each layer has only one family, the proposed approach reduces to the general multistage gatekeeping strategies of Dmitrienko et al. (2008).

Through case studies and a real clinical trial example, we showed that the proposed approach is simpler and more efficient than the hypothesis-based graphical approach of Bretz et al. (2009) when dealing with the problem of testing multiple hierarchically ordered families. In addition, due to its family-based graphical visualization, our proposed approach will be easier to communicate to non-statisticians than the original hypothesis-based graphical approach when dealing with increasingly complex hierarchical relationships among families of hypotheses.

## Appendix

### A.1    Proof of Theorem 1

Suppose that the $j$-th family in layer $i$ is tested at level $\alpha^*_{ij}$; then we know that

$$\alpha^*_{1j} = \alpha_{1j}, \qquad \alpha^*_{2i} = \alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(A_{1j})\right) g_{1j2i}. \tag{2}$$

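For $i, j = 1, 2$, define the event $E_{ij}(\alpha^*_{ij})$ = {at least one true null hypothesis is rejected in the $j$-th family of layer $i$ at significance level $\alpha^*_{ij}$}. Let $\bar{E}_{ij}(\alpha^*_{ij})$ denote the complement of $E_{ij}(\alpha^*_{ij})$. Thus,

$$\mathrm{FWER} = \Pr\left\{\bigcup_{i=1}^{2}\bigcup_{j=1}^{2} E_{ij}(\alpha^*_{ij})\right\} = \Pr\left\{\bigcup_{j=1}^{2} E_{1j}(\alpha^*_{1j})\right\} + \Pr\left\{\left(\bigcap_{j=1}^{2}\bar{E}_{1j}(\alpha^*_{1j})\right)\bigcap\left(\bigcup_{j=1}^{2}E_{2j}(\alpha^*_{2j})\right)\right\}, \tag{3}$$

where $\bar{E}_{1j}(\alpha^*_{1j})$ is the complement of $E_{1j}(\alpha^*_{1j})$.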

Let denote the set of true null hypotheses in , and and denote the sets of rejections and acceptances, respectively.

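First of all, let us consider the first term of the right side of (3). Note that

$$\Pr\left\{\bigcup_{j=1}^{2} E_{1j}(\alpha^*_{1j})\right\} \le \sum_{j=1}^{2}\Pr\left\{E_{1j}(\alpha^*_{1j})\right\} \le \sum_{j=1}^{2} e^*_{1j}(T_{1j}). \tag{4}$$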

Here, the first inequality follows from the Bonferroni inequality and the second follows from the definition of the error rate function.

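Next, we consider the second term of the right side of (3). If $\bigcap_{j=1}^{2}\bar{E}_{1j}(\alpha^*_{1j})$ is true, i.e., all of the rejected hypotheses in the two families of layer 1 are false, then $T_{11} \subseteq A_{11}$ and $T_{12} \subseteq A_{12}$, which implies $e^*_{11}(T_{11}) \le e^*_{11}(A_{11})$ and $e^*_{12}(T_{12}) \le e^*_{12}(A_{12})$, respectively. Then, by (2), we have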

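$$\alpha^*_{2i} = \alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(A_{1j})\right) g_{1j2i} \le \alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}.$$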

Thus,

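$$\left(\bigcap_{j=1}^{2}\bar{E}_{1j}(\alpha^*_{1j})\right)\bigcap\left(\bigcup_{j=1}^{2}E_{2j}(\alpha^*_{2j})\right) \subseteq \bigcup_{i=1}^{2}E_{2i}\left(\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}\right)$$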

and then by the above result and the Bonferroni inequality,

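$$\begin{aligned}
&\Pr\left\{\left(\bigcap_{j=1}^{2}\bar{E}_{1j}(\alpha^*_{1j})\right)\bigcap\left(\bigcup_{j=1}^{2}E_{2j}(\alpha^*_{2j})\right)\right\} \\
&\quad\le \Pr\left\{\bigcup_{i=1}^{2}E_{2i}\left(\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}\right)\right\} \\
&\quad\le \sum_{i=1}^{2}\Pr\left\{E_{2i}\left(\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}\right)\right\}.
\end{aligned} \tag{5}$$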

Note that the second-layer families are tested by FWER-controlling local procedures, and each probability inside the sum in the second inequality of (5) is exactly the FWER of the corresponding local procedure at level $\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}$. Thus, the right side of (5) is bounded above by

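$$\begin{aligned}
&\sum_{i=1}^{2}\left(\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha^*_{1j} - e^*_{1j}(T_{1j})\right) g_{1j2i}\right) \\
&\quad= \sum_{i=1}^{2}\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha_{1j} - e^*_{1j}(T_{1j})\right)\sum_{i=1}^{2} g_{1j2i} \\
&\quad\le \sum_{i=1}^{2}\alpha_{2i} + \sum_{j=1}^{2}\left(\alpha_{1j} - e^*_{1j}(T_{1j})\right) \\
&\quad= \sum_{i=1}^{2}\alpha_{2i} + \sum_{j=1}^{2}\alpha_{1j} - \sum_{j=1}^{2} e^*_{1j}(T_{1j}) \\
&\quad\le \alpha - \sum_{j=1}^{2} e^*_{1j}(T_{1j}).
\end{aligned} \tag{6}$$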

The first inequality of (6) follows from the fact that $\sum_{i=1}^{2} g_{1j2i} \le 1$ for any $j$.

Therefore, using (4)-(6) in (3), we have

$$\mathrm{FWER} \le \sum_{j=1}^{2} e^*_{1j}(T_{1j}) + \alpha - \sum_{j=1}^{2} e^*_{1j}(T_{1j}) = \alpha.$$

Thus, the desired result is proved.
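The bookkeeping behind (2) and (6) can be checked numerically. The sketch below applies the update rule (2) for a two-layer setting and verifies that the updated second-layer levels never exceed what is left of the overall level once the first-layer error-rate values are subtracted. The function name and all numbers are hypothetical; `used1[j]` stands in for $e^*_{1j}(A_{1j})$ and `g[j][i]` for $g_{1j2i}$.

```python
from typing import List


def update_second_layer(alpha1: List[float], alpha2: List[float],
                        used1: List[float], g: List[List[float]]) -> List[float]:
    """Rule (2): alpha*_{2i} = alpha_{2i} + sum_j (alpha_{1j} - used_{1j}) * g[j][i]."""
    for j, row in enumerate(g):
        # Each first-layer family may pass on at most all of its unused level.
        if any(w < 0 for w in row) or sum(row) > 1 + 1e-12:
            raise ValueError(f"invalid transition coefficients for family 1{j + 1}")
    return [
        a2 + sum((alpha1[j] - used1[j]) * g[j][i] for j in range(len(alpha1)))
        for i, a2 in enumerate(alpha2)
    ]


# Hypothetical two-layer example with overall level 0.05.
alpha = 0.05
alpha1 = [0.025, 0.025]        # initial levels of the two first-layer families
alpha2 = [0.0, 0.0]            # initial levels of the two second-layer families
used1 = [0.0, 0.0125]          # assumed error-rate values after testing layer 1
g = [[0.5, 0.5], [0.0, 1.0]]   # transition coefficients, each row sums to at most 1

alpha2_star = update_second_layer(alpha1, alpha2, used1, g)
budget_left = alpha - sum(used1)
# Same bound as in (6), with the acceptance sets in place of the true-null sets.
assert sum(alpha2_star) <= budget_left + 1e-12
print(alpha2_star, budget_left)
```

Because every row of `g` sums to one in this example, the bound holds with equality; rows summing to less than one would leave part of the level unallocated.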

### A.2     Proof of Theorem 2

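Let $\mathrm{FWER}_n(\alpha_1, \cdots, \alpha_n)$ denote the overall FWER of the $n$-layer family-based procedure for which the initial critical values assigned to the $n$ layers are $\alpha_1, \cdots, \alpha_n$. Within each layer $i$, suppose that the initial critical values assigned to its $l_i$ families are $\alpha_{i1}, \cdots, \alpha_{il_i}$ with $\sum_{j=1}^{l_i}\alpha_{ij} = \alpha_i$. We show the following inequality by induction,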

$$\mathrm{FWER}_n(\alpha_1, \cdots, \alpha_n) \le \sum_{i=1}^{n}\sum_{j=1}^{l_i}\alpha_{ij} \le \alpha. \tag{7}$$

If , through the proof of Theorem 1, we can get that .
Assume that (7) holds when $n = k$, that is,

$$\mathrm{FWER}_k(\alpha_1, \cdots, \alpha_k) \le \sum_{i=1}^{k}\sum_{j=1}^{l_i}\alpha_{ij} \le \alpha.$$

In the following, we show that (7) also holds for $n = k+1$, i.e.,

$$\mathrm{FWER}_{k+1}(\alpha_1, \cdots, \alpha_{k+1}) \le \sum_{i=1}^{k+1}\sum_{j=1}^{l_i}\alpha_{ij} \le \alpha.$$

Define the events $B_1$ = {at least one true null hypothesis is rejected among all the families in layer 1} and $B_2$ = {at least one true null hypothesis is rejected among the families in all the layers except layer 1}. Then we have

$$\mathrm{FWER}_{k+1}(\alpha_1, \cdots, \alpha_{k+1}) = \Pr\{B_1\} + \Pr\{\bar{B}_1 \bigcap B_2\}. \tag{8}$$

Note that

$$\Pr\{B_1\} \le \sum_{j=1}^{l_1} e^*_{1j}(T_{1j}), \tag{9}$$

which follows from the definition of the error rate function and the Bonferroni inequality. Next, let us consider the probability of the event $\bar{B}_1 \bigcap B_2$.

After testing all families in layer 1, the unused significance level of layer 1 is transferred to the respective families in layers 2 to $k+1$. Specifically, for the $j$-th family in layer $i$, $i = 2, \cdots, k+1$, the updated significance level is

$$\alpha^*_{ij} = \alpha_{ij} + \sum_{l=1}^{l_1}\left(\alpha_{1l} - e^*_{1l}(A_{1l})\right) g_{1lij}.$$

Let $\alpha^*_i = \sum_{j=1}^{l_i}\alpha^*_{ij}$ denote the updated critical value for layer $i$, $i = 2, \cdots, k+1$.

If $\bar{B}_1$ is true, which means that no true null hypothesis is rejected in any family within layer 1, then a type I error can only occur in the families of layers 2 to $k+1$. Thus,

$$\Pr\{\bar{B}_1 \bigcap B_2\} = \mathrm{FWER}_k(\alpha^*_2, \cdots, \alpha^*_{k+1}). \tag{10}$$

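Note that $\bar{B}_1$ being true also implies that $T_{1l} \subseteq A_{1l}$ for any $l = 1, \cdots, l_1$, which in turn implies $e^*_{1l}(T_{1l}) \le e^*_{1l}(A_{1l})$ due to the monotonicity condition of the error rate function. Thus, by the induction assumption,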

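$$\begin{aligned}
\mathrm{FWER}_k(\alpha^*_2, \cdots, \alpha^*_{k+1}) &\le \sum_{i=2}^{k+1}\sum_{j=1}^{l_i}\alpha^*_{ij} \\
&= \sum_{i=2}^{k+1}\sum_{j=1}^{l_i}\left[\alpha_{ij} + \sum_{l=1}^{l_1}\left(\alpha_{1l} - e^*_{1l}(A_{1l})\right) g_{1lij}\right] \\
&= \sum_{i=2}^{k+1}\sum_{j=1}^{l_i}\alpha_{ij} + \sum_{l=1}^{l_1}\alpha_{1l}\sum_{i=2}^{k+1}\sum_{j=1}^{l_i} g_{1lij} - \sum_{l=1}^{l_1} e^*_{1l}(A_{1l})\sum_{i=2}^{k+1}\sum_{j=1}^{l_i} g_{1lij} \\
&\le \sum_{i=2}^{k+1}\sum_{j=1}^{l_i}\alpha_{ij} + \sum_{l=1}^{l_1}\alpha_{1l} - \sum_{l=1}^{l_1} e^*_{1l}(A_{1l}) \\
&\le \sum_{i=1}^{k+1}\sum_{j=1}^{l_i}\alpha_{ij} - \sum_{j=1}^{l_1} e^*_{1j}(T_{1j}).
\end{aligned} \tag{11}$$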

The second inequality of (11) holds due to the condition on the transition matrix that $\sum_{i=2}^{k+1}\sum_{j=1}^{l_i} g_{1lij} \le 1$ for any fixed $l$. Therefore, by combining (8)-(11), we have

$$\mathrm{FWER}_{k+1}(\alpha_1, \cdots, \alpha_{k+1}) \le \sum_{i=1}^{k+1}\sum_{j=1}^{l_i}\alpha_{ij} \le \alpha.$$

This completes the induction and shows that (7) holds for any positive integer $n$.

## References

• [1]
• [2] Bauer P., Rohmel J., Maurer W. and Hothorn L. (1998). Testing strategies in multi-dose experiments including active control. Statistics in Medicine 17, 2133–2146.
• [3] Bretz F., Maurer W., Brannath W. and Posch M. (2009). A graphical approach to sequentially rejective multiple test procedures. Statistics in Medicine 28, 586–604.
• [4] Burman C. F., Sonesson C. and Guilbaud O. (2009). A recycling framework for the construction of Bonferroni-based multiple tests. Statistics in Medicine 28, 739–761.
• [5] Dmitrienko A., Offen W. and Westfall P. H. (2003). Gatekeeping strategies for clinical trials that do not require all primary effects to be significant. Statistics in Medicine 22, 2387–2400.
• [6] Dmitrienko A. and Tamhane A. C. (2011). Mixtures of multiple testing procedures for gatekeeping applications in clinical trials. Statistics in Medicine 30, 1473–1488.
• [7] Dmitrienko A. and Tamhane A. C. (2013). General theory of mixture procedures for gatekeeping. Biometrical Journal 55, 311–320.
• [8] Dmitrienko A., Tamhane A. C., Liu L. and Wiens B. L. (2008). A note on tree gatekeeping procedures in clinical trials. Statistics in Medicine 27, 3446–3451.
• [9] Dmitrienko A., Tamhane A. C., Wang X. and Chen X. (2006). Stepwise gatekeeping procedures in clinical trial applications. Biometrical Journal 48, 984–991.
• [10] Dmitrienko A., Tamhane A. C. and Wiens B. L. (2008). General multistage gatekeeping procedures. Biometrical Journal 50, 667–677.
• [11] Dmitrienko A., Wiens B. L. and Tamhane A. C. (2007). Tree–structured gatekeeping tests in clinical trials with hierarchically ordered multiple objectives. Statistics in Medicine 26, 2465–2478.
• [12] Guilbaud O. (2007). Bonferroni parallel gatekeeping - transparent generalizations, adjusted p-values, and short direct proofs. Biometrical Journal 49, 917–927.
• [13] Kordzakhia G. and Dmitrienko A. (2013). Superchain procedures in clinical trials with multiple objectives. Statistics in Medicine 32, 486–508.
• [14] Marcus R., Peritz E. and Gabriel K. R. (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika 63, 655–660.
• [15] Maurer W. and Bretz F. (2014). A note on testing families of hypotheses using graphical procedures. Statistics in Medicine 33, 5340–5346.
• [16] Maurer W., Hothorn L. and Lehmacher W. (1995). Multiple comparisons in drug clinical trials and preclinical assays: a-priori ordered hypotheses. In Biometrie in der Chemisch-pharmazeutischen Industrie, Vollmar J(ed.). Fischer Verlag: Stuttgart, 6, 3–18.
• [17] Westfall P. H. and Krishen A. (2001). Optimally weighted, fixed-sequence, and gatekeeping multiple testing procedures. Journal of Statistical Planning and Inference 99, 25–40.
• [18]