Blinded continuous information monitoring of recurrent event endpoints with time trends in clinical trials

03/06/2019
by   Tobias Mütze, et al.
Novartis

Blinded sample size re-estimation and information monitoring based on blinded data have been suggested to mitigate risks due to planning uncertainties regarding nuisance parameters. Motivated by a randomized controlled trial in pediatric multiple sclerosis (MS), a continuous monitoring procedure for overdispersed count data was proposed recently. However, this procedure assumed constant event rates, an assumption often not met in practice. Here we extend the procedure to accommodate time trends in the event rates considering two blinded approaches: (a) the mixture approach modeling the number of events by a mixture of two negative binomial distributions, and (b) the lumping approach approximating the marginal distribution of the event counts by a negative binomial distribution. Through simulations the operating characteristics of the proposed procedures are investigated under decreasing event rates. We find that the type I error rate is not inflated relevantly by either of the monitoring procedures, with the exception of strong time dependencies where the procedure assuming constant rates exhibits some inflation. Furthermore, the procedure accommodating time trends has generally favorable power properties compared to the procedure based on constant rates, which often stops too late. The proposed method is illustrated by the clinical trial in pediatric MS.


1 Introduction

Misspecification of nuisance parameters in the design of a clinical trial bears the risk of inconclusive results or wasteful use of resources when the variation in the data is larger or smaller than expected, respectively. To mitigate these risks, nuisance parameter based sample size re-estimation has been suggested and is commonly applied in clinical trials. Re-estimation procedures based on blinded or non-comparative data generally lead to smaller bias and type I error rate inflation than unblinded procedures, and unblinded procedures bear the risk of compromising the integrity of the trial. Blinded procedures are therefore preferred over unblinded approaches in regulatory guidance documents (European Medicines Agency (EMA), 2007; Food and Drug Administration (FDA), 2018).

The variability of the sample size resulting from a nuisance parameter based sample size re-estimation can be reduced by repeated estimation of the nuisance parameters during the course of the study. The power and expected sample size of designs with repeated re-estimation are similar to those of designs with a single re-estimation (Friede and Miller, 2012). If taken to an extreme, repeated sample size re-estimation results in continuous monitoring with estimation of the nuisance parameters after every new data point. The trial is stopped once a sufficient level of information, i.e. precision of treatment effect estimate, is achieved. This is analogous to event-driven trials in which the trial is stopped once a prespecified number of events is observed.

Blinded continuous monitoring was considered for normally distributed data by Friede and Miller (2012) and was recently transferred to the setting of recurrent event data by Friede et al. (2018). Whereas with normally distributed data only the variance needs to be monitored, with recurrent event data the information depends on the event rates, follow-up times, and overdispersion parameters, i.e., the between-subject variability. Without knowledge of the treatment groups the overall event rate can be estimated, which can then be split into group-specific estimates under the assumption of a treatment effect hypothesized under the planning alternative. In randomized controlled trials, the follow-up times and the overdispersion parameters are usually assumed to be the same across treatment groups and can therefore fairly easily be estimated from blinded data pooled from all treatment groups. It could be shown that the application of such designs leads to shorter trial durations in comparison to traditional fixed designs while maintaining the power (Friede, Häring and Schmidli, 2018). For clinical trials with recurrent event data, continuous information monitoring differs from a repeated sample size re-estimation in that continuous information monitoring does not necessarily result in a change in sample size. Depending on the duration of the recruitment period, continuous information monitoring can materialize as a change in the study duration while keeping the total sample size as initially planned.

The investigations by Friede et al. were motivated by a randomized controlled trial in pediatric multiple sclerosis where the annualized relapse rate was the primary endpoint (Chitnis, Arnold, Banwell et al., 2018). As this was the first large-scale double-blind, randomized controlled trial in this population, the design had to be based on adult data, resulting in considerable uncertainty. Since the event rates were larger than assumed in the planning, the trial was stopped early based on results from blinded data looks. The analysis of this trial confirmed decreasing event rates over follow-up time, a trend previously also observed in a meta-analysis of adult data by Nicholas et al. (2012). The procedure by Friede et al. (2018), however, assumes constant event rates. Ignoring time trends in the event rates may lead to biased estimates of the event rates as well as of the overdispersion. Given the observed temporal trends in adult data, see Nicholas et al. (2012), Schneider et al. (2013a) extended the sample size re-estimation procedures for overdispersed count data published by Friede and Schmidli (2010a, b) to account for these trends. To our knowledge, however, no blinded continuous monitoring procedure of the information for overdispersed count data that accounts for time trends in the event rates is available to date. With this manuscript we want to close this gap.

The manuscript is organized as follows. In the following section some notation and underlying concepts are introduced before providing more background on the motivating clinical trial in pediatric multiple sclerosis in Section 3. In Section 4 blinded monitoring procedures are proposed and their operating characteristics are explored in a simulation study described in Section 5. The motivating example is revisited in Section 6. We close with a brief discussion.

2 Statistical model, hypothesis testing, and information

In this section, we define a non-homogeneous Poisson process with Gamma frailty as a model for recurrent events with time trends. Moreover, we define the statistical hypothesis of interest as well as an appropriate statistical test. We conclude with defining information for the introduced statistical model.

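Denote by $N_{ij}(t)$ the number of events that subject $j = 1, \ldots, n_i$ in group $i = 1, 2$ has experienced up to study time $t$. Study time is the time since randomization of a subject, i.e., the exposure time of a subject. We assume that the recurrent events of each subject stem from a non-homogeneous Poisson process with a log-linear baseline rate $\lambda_0(t) = \exp(\beta_0 + \gamma t)$. Furthermore, proportionality of the rates between the treatment group and the control group is assumed. Therefore, the rate function for subject $j$ in group $i$, conditional on the subject-specific frailty $Z_{ij}$, is given by

$$\lambda_{ij}(t \mid Z_{ij}) = Z_{ij} \exp(\beta_0 + \beta x_i + \gamma t). \tag{1}$$

Here, $x_i$ is the group indicator which is zero in the control group, $x_1 = 0$, and one in the treatment group, $x_2 = 1$. Moreover, the subject-specific frailty $Z_{ij}$ is modeled as Gamma distributed, i.e. $Z_{ij} \sim \Gamma(1/\phi, 1/\phi)$, with the Gamma distribution parameterized by shape and rate such that $Z_{ij}$ has an expected value and a variance of 1 and $\phi$, respectively. Hence, it follows that the number of events conditional on the frailty is Poisson distributed, that is

$$N_{ij}(t) \mid Z_{ij} \sim \mathrm{Pois}\left(Z_{ij} \Lambda_i(t)\right)$$

with the cumulative rate function

$$\Lambda_i(t) = \int_0^t \exp(\beta_0 + \beta x_i + \gamma u)\, du = \exp(\beta x_i)\, \Lambda_0(t).$$

The baseline cumulative rate function is denoted $\Lambda_0(t)$ and given by $\Lambda_0(t) = \exp(\beta_0)\,(\exp(\gamma t) - 1)/\gamma$ for $\gamma \neq 0$ and by $\Lambda_0(t) = \exp(\beta_0)\, t$ for $\gamma = 0$. It is important to emphasize that the rate functions and the cumulative rate functions $\Lambda_i(t)$ are not only functions of $t$ but also functions of $\beta_0$, $\beta$, and $\gamma$. However, we do not explicitly mention the parameters $\beta_0$, $\beta$, and $\gamma$ in our notation for the sake of readability. Marginally, the number of events follows a negative binomial distribution with rate $\Lambda_i(t)$ and shape parameter $\phi$ (Lawless, 1987a, b), that is

$$N_{ij}(t) \sim \mathrm{NB}\left(\Lambda_i(t), \phi\right).$$

The expected value and the variance of the number of events are $\Lambda_i(t)$ and $\Lambda_i(t)\,(1 + \phi \Lambda_i(t))$, respectively. Thus, the variance increases in the shape parameter $\phi$.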
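The cumulative rate and the marginal moments described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the log-linear rate exp(beta0 + beta*x + gamma*t) and a Gamma frailty with mean 1 and variance phi; the function and argument names are hypothetical and not taken from the paper.

```python
import math


def cumulative_rate(t, beta0, gamma, beta=0.0, x=0):
    """Integral of exp(beta0 + beta*x + gamma*u) over u in [0, t] (assumed model)."""
    scale = math.exp(beta0 + beta * x)
    if abs(gamma) < 1e-12:
        # No time trend: the rate is constant, so the integral is rate * t.
        return scale * t
    # expm1 keeps precision when gamma * t is close to zero.
    return scale * math.expm1(gamma * t) / gamma


def marginal_moments(t, beta0, gamma, phi, beta=0.0, x=0):
    """Mean and variance of the negative binomial count up to study time t."""
    mean = cumulative_rate(t, beta0, gamma, beta, x)
    variance = mean * (1.0 + phi * mean)
    return mean, variance


mean, variance = marginal_moments(t=2.0, beta0=0.0, gamma=-1.0, phi=1.25)
print(mean, variance)
```

The variance exceeds the mean for every phi > 0, which is the overdispersion the frailty is meant to capture.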

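In this manuscript, we are interested in the superiority of an experimental treatment over the control. Under the assumption that smaller rates are better, superiority in the model above is given when the parameter $\beta$ is smaller than zero. Thus, the question of superiority of the treatment over control can be written as the statistical testing problem

$$H_0: \beta \geq 0 \quad \text{versus} \quad H_1: \beta < 0.$$

To test the null hypothesis $H_0$, we employ a Wald test based on asymptotic maximum likelihood theory. Therefore, we first discuss the maximum likelihood estimation of the parameters from the model above, $\vartheta = (\beta_0, \beta, \gamma, \phi)$, and the parameters' asymptotic properties, followed by an introduction of the Wald test for $H_0$. Let $t_{ij}(c)$ be the exposure time of subject $j$ in group $i$ at a calendar time $c$, and let $y_{ij}(c) = N_{ij}(t_{ij}(c))$ be the corresponding number of events. The study times at which the events of a subject occurred are denoted by $t_{ijl}$, $l = 1, \ldots, y_{ij}(c)$. Then, according to Lawless (1987b), the likelihood function is given by

$$L_c(\vartheta) = \prod_{i=1}^{2} \prod_{j=1}^{n_i} \left[ \prod_{l=1}^{y_{ij}(c)} \lambda_i(t_{ijl}) \right] \frac{\Gamma(y_{ij}(c) + 1/\phi)}{\Gamma(1/\phi)} \, \frac{\phi^{y_{ij}(c)}}{\left(1 + \phi \Lambda_i(t_{ij}(c))\right)^{y_{ij}(c) + 1/\phi}} \tag{2}$$

with $\lambda_i(t) = \exp(\beta_0 + \beta x_i + \gamma t)$ and $\Lambda_i(t)$ the corresponding cumulative rate function. This results in the following log-likelihood function $\ell_c(\vartheta)$:

$$\ell_c(\vartheta) = \sum_{i=1}^{2} \sum_{j=1}^{n_i} \left[ \sum_{l=1}^{y_{ij}(c)} \log \lambda_i(t_{ijl}) + \log \frac{\Gamma(y_{ij}(c) + 1/\phi)}{\Gamma(1/\phi)} + y_{ij}(c) \log \phi - \left(y_{ij}(c) + \frac{1}{\phi}\right) \log\left(1 + \phi \Lambda_i(t_{ij}(c))\right) \right] \tag{3}$$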
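As a hedged illustration of the log-likelihood of a non-homogeneous Poisson process with Gamma frailty, the following Python sketch evaluates the contribution of each subject from its event times and exposure time. It assumes the log-linear rate exp(beta0 + beta*x + gamma*t) and a frailty with mean 1 and variance phi; the data layout (tuples of event times, exposure, group indicator) and all names are hypothetical choices for this example.

```python
import math


def cumulative_rate(t, beta0, beta, gamma, x):
    """Integral of exp(beta0 + beta*x + gamma*u) over u in [0, t]."""
    scale = math.exp(beta0 + beta * x)
    if abs(gamma) < 1e-12:
        return scale * t
    return scale * math.expm1(gamma * t) / gamma


def subject_loglik(event_times, exposure, x, beta0, beta, gamma, phi):
    """Log-likelihood contribution of one subject (frailty integrated out)."""
    y = len(event_times)
    cum = cumulative_rate(exposure, beta0, beta, gamma, x)
    # Sum of log rates at the observed event times.
    ll = sum(beta0 + beta * x + gamma * t for t in event_times)
    # Terms that arise from integrating over the Gamma frailty.
    ll += math.lgamma(y + 1.0 / phi) - math.lgamma(1.0 / phi) + y * math.log(phi)
    ll -= (y + 1.0 / phi) * math.log1p(phi * cum)
    return ll


def loglik(subjects, beta0, beta, gamma, phi):
    """Total log-likelihood; subjects is a list of (event_times, exposure, x)."""
    return sum(
        subject_loglik(times, exposure, x, beta0, beta, gamma, phi)
        for times, exposure, x in subjects
    )


# Hypothetical toy data: one control subject and one treated subject.
data = [([0.3, 1.1], 2.0, 0), ([], 1.5, 1)]
print(loglik(data, beta0=0.0, beta=math.log(0.5), gamma=-1.0, phi=1.25))
```

In practice this function would be passed to a numerical optimizer; the sketch only evaluates it at a given parameter vector.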

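The maximum likelihood estimators $\hat\vartheta(c) = (\hat\beta_0(c), \hat\beta(c), \hat\gamma(c), \hat\phi(c))$ at calendar time $c$ are calculated by maximizing the log-likelihood function (3). This maximization can be performed by finding the root of the system of score equations $\partial \ell_c(\vartheta) / \partial \vartheta = 0$ using the Newton-Raphson method. We list the partial derivatives in Appendix A. The maximum likelihood estimators are asymptotically normally distributed in the sense that

$$\hat\vartheta(c) \overset{\text{approx.}}{\sim} \mathcal{N}\left(\vartheta, J_c(\vartheta)^{-1}\right). \tag{4}$$

Here, $J_c(\vartheta) = -E\left[\partial^2 \ell_c(\vartheta) / \partial\vartheta\, \partial\vartheta^\top\right]$ is the Fisher information matrix, which is a function of the sample size, the unknown parameter vector, and the individual exposure times at calendar time $c$. For details, we refer to Appendix A. Let $e_\beta$ denote the unit vector that selects the entry of $\vartheta$ corresponding to $\beta$. We define the Wald statistic at calendar time $c$ by

$$T_c = \frac{\hat\beta(c)}{\sqrt{e_\beta^\top J_c(\hat\vartheta(c))^{-1} e_\beta}} \tag{5}$$

with $J_c(\hat\vartheta(c))$ the plug-in estimator of the Fisher information matrix obtained by plugging the maximum likelihood parameter estimators into the formula of the Fisher information matrix. From the asymptotic normality of the maximum likelihood estimator it follows that the Wald statistic is asymptotically standard normally distributed at the boundary of the parameter space defined by the null hypothesis $H_0$, that is at $\beta = 0$. Therefore, the Wald test that rejects $H_0$ when $T_c$ is smaller than the $\alpha$-quantile $q_\alpha$ of a standard normal distribution is an asymptotic level $\alpha$ test for the null hypothesis $H_0$.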

We conclude this section by recapitulating the concept of statistical information. In general, the information for a treatment effect $\beta$ is the reciprocal of the variance of its estimator $\hat\beta$, that is $\mathcal{I} = 1/\mathrm{Var}(\hat\beta)$ (Jennison and Turnbull, 2000). The information measures the knowledge about the unknown treatment effect, with larger values corresponding to a smaller uncertainty about the unknown treatment effect. For the non-homogeneous Poisson process model with Gamma frailty, the information at calendar time $c$ is given by

$$\mathcal{I}_c = \frac{1}{e_\beta^\top J_c(\vartheta)^{-1} e_\beta},$$

with $J_c(\vartheta)$ the Fisher information matrix at calendar time $c$ and $e_\beta$ the unit vector selecting the entry that corresponds to $\beta$. Analogously to the Fisher information matrix $J_c(\vartheta)$, the information $\mathcal{I}_c$ is a function of the sample size, the parameter vector, and the individual exposure times. The information increases when the sample size $n$, the individual exposure times, or the rates increase, and it decreases when the overdispersion parameter $\phi$ increases. Since the information is defined through the variance of the parameter estimate, the information is closely linked to the power of the previously introduced Wald test. For a parameter $\beta^*$ located in the parameter space of the alternative hypothesis $H_1$, the power increases as the information increases. Moreover, for a given power $1 - \pi$ and the significance level $\alpha$, the target information $\mathcal{I}_{\text{target}}$ required for the Wald test to achieve a power of $1 - \pi$ for the parameter $\beta^*$ can be determined (Jennison and Turnbull, 2000; Tsiatis, 2006):

$$\mathcal{I}_{\text{target}} = \frac{(q_{1-\alpha} + q_{1-\pi})^2}{(\beta^*)^2}.$$

It is worth noting that the target information only depends on the significance level $\alpha$, the target power $1 - \pi$, and the assumed effect in the alternative $\beta^*$. When planning a clinical trial, the sample size $n$, the study duration, the accrual period, and the maximum individual exposure time are chosen such that the trial conveys the desired target information at the end of the trial for a parameter vector $\vartheta^*$. Unless the exposure times are identical for all subjects at the end of the trial, no closed form expression for converting the target information into the sample size, study duration, etc. exists (Schneider, Schmidli and Friede, 2013a).
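The dependence of the target information on only the significance level, the power, and the assumed effect can be made concrete with a short Python sketch. It uses the standard formula for a one-sided Wald test, (q_{1-alpha} + q_{power})^2 / effect^2, with the effect on the log rate ratio scale; the function name and the example rate ratios are illustrative assumptions.

```python
import math
from statistics import NormalDist


def target_information(alpha, power, log_rate_ratio):
    """Information needed by a one-sided level-alpha Wald test to reach the power."""
    q_alpha = NormalDist().inv_cdf(1.0 - alpha)
    q_power = NormalDist().inv_cdf(power)
    return (q_alpha + q_power) ** 2 / log_rate_ratio ** 2


for rate_ratio in (0.5, 0.7):
    info = target_information(alpha=0.025, power=0.8, log_rate_ratio=math.log(rate_ratio))
    print(rate_ratio, round(info, 2))
```

A smaller assumed effect (a rate ratio closer to one) requires more information, which is why the scenarios with the weaker effect need larger trials.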

3 Motivating example: Clinical trial in pediatric multiple sclerosis

Multiple sclerosis is a disease of the central nervous system that is in many patients characterized by periods of disease worsening, so-called relapses, followed by periods of recovery. Chitnis et al. (2018) published results of a clinical trial (ClinicalTrials.gov Identifier: NCT01892722) assessing the efficacy and safety of fingolimod versus interferon beta-1a in pediatric multiple sclerosis. The trial included 215 subjects who were randomized 1:1 between the two treatments. The primary endpoint was the annualized relapse rate determined by a negative binomial regression model of the number of relapses per subject.

The sample size of the clinical trial in pediatric multiple sclerosis was planned based on results of the TRANSFORMS clinical trial (ClinicalTrials.gov Identifier: NCT00340834) which assessed efficacy and safety of fingolimod in adults with relapsing-remitting multiple sclerosis, because no prior clinical trials in the pediatric population were available. In detail, the sample size of the clinical trial in pediatric multiple sclerosis was planned assuming a relative reduction of 50% in the annualized relapse rate from 0.36 for the interferon beta-1a arm to 0.18 for the fingolimod arm. With a fixed follow-up of two years per subject, a target power of 80%, a two-sided significance level of 5%, and an overdispersion parameter of , a total sample size of 190 subjects was planned. In a blinded assessment of the accumulated information during the trial, it was determined that the trial would be overpowered when conducted as initially planned. Based on this blinded information assessment and in agreement with the regulatory agencies, the clinical trial design was changed from a fixed duration to a flexible duration and stopped early.

In the following, we analyze the relapses observed in the clinical trial in pediatric multiple sclerosis using the non-homogeneous Poisson model with a log-linear time trend introduced in Section 2. Table 1 lists the parameter estimates for the standard negative binomial model, which does not account for time trends, and model (1) when applied to data from the pediatric multiple sclerosis trial. It is important to note that the primary analysis published by Chitnis et al. (2018) was a negative binomial regression adjusted for treatment, region, number of relapses in the previous two years before study enrollment, and pubertal status. For illustrative purposes, we keep the model simple and do not adjust for covariates.

Parameter Model without time trend Model with time trend
-
Cumulative rates
Table 1: Fit of negative binomial model (model without time trend) and model (1) (model with time trend) for data from the pediatric multiple sclerosis trial. The confidence intervals of the point estimates are shown in brackets.

The estimated relative reduction of the relapse rate is for both models listed in Table 1 which is in accordance with the results published by Chitnis et al. (2018). Comparing the models with and without time trend, the cumulative rates after two years, the estimated effect, and the estimated shape parameter are similar with relative differences of or less. Moreover, the estimated trend parameter is . Thus, the estimated relapse rate decreases by within two years from to in the interferon beta-1a group and from to in the fingolimod group. We use this example in Section 5 to motivate the simulation setting and revisit the example in Section 6.

4 Blinded continuous information monitoring for recurrent events with time trends

4.1 Basic setting and notation

In this section we propose a procedure for blinded continuous information monitoring for the non-homogeneous Poisson model with a Gamma frailty introduced in Section 2. We start by outlining the concept of blinded continuous information monitoring and introduce notation related to the blinded sample. To begin with, the target information $\mathcal{I}_{\text{target}}$ for the significance level $\alpha$, the power $1 - \pi$, and the effect of interest $\beta^*$ is determined before the trial. Then, the clinical trial design, that is the sample size, study duration, accrual period, etc., is determined based on guesstimates for the nuisance parameters $\beta_0$, $\gamma$, and $\phi$ such that the target information is reached at the end of the trial under the premise of correct nuisance parameter guesstimates. However, instead of conducting the trial as initially designed, the information is monitored continuously in the calendar time $c$ and the trial is stopped at the first point in time at which the monitored information exceeds the target information $\mathcal{I}_{\text{target}}$. After the trial is stopped, the null hypothesis $H_0$ is tested using the fixed sample Wald test introduced in Section 2. It can occur that the information at the initially planned end of the trial is smaller than the target information. In this case, the trial could be continued beyond its initially planned duration to obtain the target information $\mathcal{I}_{\text{target}}$.

Naturally, the question arises of how to monitor the information continuously. Here, we focus on information monitoring procedures that maintain blinding for the reasons outlined in Section 1. From a statistical perspective, the main challenge in designs with blinded continuous information monitoring is to find an estimator for the information $\mathcal{I}_c$ without knowing the treatment indicator of the subjects. The information $\mathcal{I}_c$ is a function of the nuisance parameters $\beta_0$, $\gamma$, and $\phi$, the treatment effect $\beta$, and the individual exposure times. Since the data are blinded, $\beta$ cannot be estimated and we propose a continuous information monitoring procedure for the planning alternative $\beta^*$. Furthermore, the Fisher information matrix, which is used to calculate the information $\mathcal{I}_c$, explicitly depends on the subject-specific treatment group indicator. Therefore, this section is split into two parts. In Section 4.2 we propose two procedures for blinded estimation of the nuisance parameters. In Section 4.3 we illustrate how to estimate the Fisher information matrix, and therefore the information, without knowing the treatment group indicator.

For the blinded data, the notation introduced in Section 2 is changed by substituting the index $ij$ by $j$, and the upper limit of the index $j$ is changed from $n_i$ to $n = n_1 + n_2$. In detail, at calendar time $c$, subject $j = 1, \ldots, n$ has an exposure time of $t_j(c)$ and has experienced $y_j(c)$ events at study times $t_{jl}$ with $l = 1, \ldots, y_j(c)$. Let $k_i = n_i/n$ be the proportion of patients to be randomized into group $i$, which is assumed to be known.

4.2 Blinded estimation of nuisance parameters

4.2.1 Mixture approach

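In a randomized trial with allocation proportions $k_1$ and $k_2 = 1 - k_1$, a subject $j$ from the blinded sample has with probability $k_2$ been randomized to the treatment group and with probability $k_1$ to the control group. Thus, the cumulative number of events from a subject in the blinded sample follows a mixture of two negative binomial distributions, that is

$$N_j(t) \sim k_1\, \mathrm{NB}\left(\Lambda_1(t), \phi\right) + k_2\, \mathrm{NB}\left(\Lambda_2(t), \phi\right). \tag{6}$$

Modeling the blinded sample through a mixture of two distributions has also been considered by Asendorf et al. (2017, 2018) in the context of longitudinal count data. Since we aim to only estimate the nuisance parameters, we replace the cumulative rate function $\Lambda_2(t)$ in (6) under the assumption of a treatment effect $\beta^*$ by $\exp(\beta^*)\Lambda_0(t)$. From (6) it follows that the log-likelihood of the blinded sample is given by

$$\begin{aligned} \ell_c^{\text{mix}}(\beta_0, \gamma, \phi) = \sum_{j=1}^{n} \log \Big[ & k_1\, f_{\mathrm{NB}}\left(y_j(c); \Lambda_0(t_j(c)), \phi\right) & (7) \\ & + k_2\, f_{\mathrm{NB}}\left(y_j(c); \exp(\beta^*)\Lambda_0(t_j(c)), \phi\right) \Big], & (8) \end{aligned}$$

where $f_{\mathrm{NB}}(y; \mu, \phi)$ denotes the probability mass function of a negative binomial distribution with mean $\mu$ and shape parameter $\phi$. Then, the maximum likelihood estimators of the nuisance parameters $\beta_0$, $\gamma$, and $\phi$ at calendar time $c$ are defined by

$$\left(\hat\beta_0(c), \hat\gamma(c), \hat\phi(c)\right) = \arg\max_{\beta_0, \gamma, \phi}\; \ell_c^{\text{mix}}(\beta_0, \gamma, \phi). \tag{9}$$

It is important to emphasize that the mixture approach for blinded estimation of the nuisance parameters through (9) differs from expectation-maximization (EM) algorithm-based procedures for blinded parameter estimation in that the EM algorithm-based procedures also estimate the treatment effect. The appropriateness of EM algorithm-based procedures has been controversially discussed in the past (Friede and Kieser, 2002; Waksman, 2007; Cook, Bergeron, Boher and Liu, 2009; Schneider, Schmidli and Friede, 2013b; Cook, 2013).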

4.2.2 Lumping approach

The lumping approach for blinded nuisance parameter estimation in a non-homogeneous Poisson model with Gamma frailty was proposed by Schneider et al. (2013a), who extended a previous proposal by Friede and Schmidli (2010b) to time-dependent event rates. The idea is to approximate the marginal distribution of the number of events by a negative binomial distribution, that is

$$N_j(t) \overset{\text{approx.}}{\sim} \mathrm{NB}\left(\tilde\Lambda(t), \tilde\phi\right) \tag{10}$$

with the cumulative rate function given by

$$\tilde\Lambda(t) = k_1 \Lambda_0(t) + k_2 \exp(\beta^*) \Lambda_0(t).$$

As the last display shows, the cumulative rate function of the blinded sample is modeled by a mixture of the cumulative rate functions from the treatment arm and the control arm, with the weight of each part equal to the proportion of the sample size allocated to the respective arm. The blinded estimators for the nuisance parameters $\beta_0$, $\gamma$, and $\phi$ at calendar time $c$ are obtained by maximizing the likelihood function of the blinded sample,

$$\left(\tilde\beta_0(c), \tilde\gamma(c), \tilde\phi(c)\right) = \arg\max_{\beta_0, \gamma, \phi}\; \tilde L_c(\beta_0, \gamma, \phi). \tag{11}$$

The likelihood function $\tilde L_c$ of the blinded sample is the likelihood of a sample of independent negative binomial distributed random variables with rate parameter $\tilde\Lambda(t_j(c))$ and dispersion parameter $\tilde\phi$ as in (10).

The blinded maximum likelihood estimates (11) are not consistent under model (6). In particular, the parameter estimator $\tilde\phi(c)$ overestimates the dispersion parameter $\phi$ as it accounts for both the within-group and the between-group variability in model (10). For the lumping approach, we modeled the blinded data set by a negative binomial distribution. Technically, the assumption of a single negative binomial distribution is incorrect as the blinded sample follows a mixture of two negative binomial distributions as illustrated in Section 4.2.1. However, as previous research in the context of blinded sample size adjustments for negative binomial data has shown, the lumping approach is appropriate for blinded nuisance parameter estimation (Schneider, Schmidli and Friede, 2013a; Friede and Schmidli, 2010b; Schneider, Schmidli and Friede, 2013b).
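To make the difference between the two blinded models tangible, the following Python sketch evaluates the per-subject log-probability of a blinded count under the mixture model and under the lumping approximation. It is a simplified illustration that works with counts only and assumes a negative binomial parameterized by its mean and a shape parameter phi (variance mean * (1 + phi * mean)); the function names, the allocation argument `k_treat`, and the example values are hypothetical.

```python
import math


def nb_logpmf(y, mean, phi):
    """Log-pmf of a negative binomial with the given mean and shape phi."""
    r = 1.0 / phi
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + y * math.log(phi * mean)
        - (y + r) * math.log1p(phi * mean)
    )


def mixture_logpmf(y, control_mean, phi, beta_star, k_treat=0.5):
    """Blinded count modeled as a mixture of control and treatment distributions."""
    treat_mean = control_mean * math.exp(beta_star)
    p_control = (1.0 - k_treat) * math.exp(nb_logpmf(y, control_mean, phi))
    p_treat = k_treat * math.exp(nb_logpmf(y, treat_mean, phi))
    return math.log(p_control + p_treat)


def lumped_logpmf(y, control_mean, phi, beta_star, k_treat=0.5):
    """Blinded count modeled by one negative binomial with the averaged mean."""
    lumped_mean = ((1.0 - k_treat) + k_treat * math.exp(beta_star)) * control_mean
    return nb_logpmf(y, lumped_mean, phi)


# Illustrative values: control mean 1.5, shape 1.25, assumed rate ratio 0.5.
for y in range(4):
    print(y, mixture_logpmf(y, 1.5, 1.25, math.log(0.5)), lumped_logpmf(y, 1.5, 1.25, math.log(0.5)))
```

Both models share the same mean for the blinded count, but only the mixture separates within-group from between-group variability, which is why the lumped shape parameter tends to come out larger.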

4.3 Blinded estimation of information

The entries of the Fisher information matrix $J_c(\vartheta)$ relevant for calculating the information $\mathcal{I}_c$ are sums of subject-specific values over $j = 1, \ldots, n_i$ with either $i = 2$ or $i = 1, 2$. Thus, the entries of $J_c(\vartheta)$ have one of the following structures

$$\sum_{j=1}^{n_2} f_2\left(t_{2j}(c)\right), \tag{12}$$

$$\sum_{i=1}^{2} \sum_{j=1}^{n_i} f_i\left(t_{ij}(c)\right). \tag{13}$$

For instance, the entry of $J_c(\vartheta)$ which corresponds to the negative second partial derivative with respect to $\beta_0$ is the sum of

$$f_i\left(t_{ij}(c)\right) = \frac{\Lambda_i(t_{ij}(c))}{1 + \phi \Lambda_i(t_{ij}(c))}$$

over $i = 1, 2$ and $j = 1, \ldots, n_i$. The summands are not identical between the two groups, that is $f_1 \neq f_2$, under the alternative. Therefore, a blinded estimator of the Fisher information is not obtained by simply plugging the blinded estimators for the parameters $\beta_0$, $\gamma$, and $\phi$ from Section 4.2 into the sums. However, when the exposure times in both treatment groups have the same distribution at a given calendar time $c$, the exposure times from the blinded sample at calendar time $c$ are distributed as the exposure times in each treatment group. This results in the following approximation of the sums (12) and (13) by sums utilizing the exposure times from the blinded sample:

$$\sum_{j=1}^{n_2} f_2\left(t_{2j}(c)\right) \approx k_2 \sum_{j=1}^{n} f_2\left(t_j(c)\right), \tag{14}$$

$$\sum_{i=1}^{2} \sum_{j=1}^{n_i} f_i\left(t_{ij}(c)\right) \approx \sum_{j=1}^{n} \left[ k_1 f_1\left(t_j(c)\right) + k_2 f_2\left(t_j(c)\right) \right]. \tag{15}$$

The sums on the right side of (14) and (15) only use information about the exposure times that is available in the blinded sample. Therefore, a blinded estimator of the Fisher information matrix at calendar time $c$ is obtained in two steps. Firstly, the sums that make up the entries of the Fisher information matrix are rewritten as illustrated in (14) and (15). Secondly, the nuisance parameters $\beta_0$, $\gamma$, and $\phi$ are replaced by the blinded estimates proposed in Section 4.2, and the effect $\beta$ is replaced by the planning alternative $\beta^*$.

We denote the resulting blinded continuous information monitoring procedure by BCM-Trend–Lump if the nuisance parameters are estimated based on the lumping approach and by BCM-Trend–Mix if the nuisance parameters are estimated based on the mixture approach.

5 Simulation study

5.1 Purpose of simulation study and motivation of scenarios

In this section we assess the operating characteristics of the proposed blinded continuous information monitoring procedure. The focus is on two settings. Firstly, when a sponsor plans a clinical trial, major design aspects are the sample size and the corresponding power of the clinical trial under the assumed effect $\beta^*$. Thus, an important requirement on blinded continuous information monitoring procedures is that the trial's target power is maintained when the sample size planning assumptions are fulfilled. We refer to the setting in which the planning assumptions are fulfilled as Setting I. A sponsor's motivation for conducting a clinical trial with a blinded continuous information monitoring procedure is to stop the trial early, that is to reduce the sample size or the study duration, while maintaining the target power when the initially planned trial is overpowered. Therefore, in Setting II, the clinical trials are overpowered due to misspecified planning assumptions. In both settings, we are interested in multiple performance measures. Firstly, we assess the type I error rate of the blinded continuous information monitoring procedure since a regulatory requirement of adaptive designs is to control the type I error rate (Food and Drug Administration (FDA), 2018; European Medicines Agency (EMA), 2007). Additionally, motivated by regulatory requirements, we evaluate the bias of the treatment effect estimator for designs with blinded continuous information monitoring. From the sponsor's perspective, the power as well as the distributions of the study duration and the sample size of designs with information monitoring are of interest.

The parameters for the simulation study are motivated by the clinical trial in pediatric multiple sclerosis published by Chitnis et al. (2018), which was discussed in Section 3. In detail, we focus on a clinical trial with a planned maximum individual follow-up time of two years and a recruitment period of two years. The individual follow-up times cannot be extended beyond the initially planned maximum individual follow-up of two years. This results in a trial duration of four years for a fixed sample design. The target power is 0.8 for a one-sided significance level of 0.025. Motivated by the observed cumulative rate presented in Section 3, the cumulative rate in the control group after two years is chosen to be 1.5, that is an annualized control rate of 0.75. The time trend parameter is chosen to be -0.25, -1, and -1.5. The strongest trend is included as an extreme case as the rate at two years is less than 0.05. The intercept parameter is chosen such that the cumulative rate is 1.5 after two years. The rate and cumulative rate functions are illustrated in Figure 1. Under the null hypothesis, the rates are equal in both groups, that is the rate ratio is 1, and under the alternative hypothesis, we assume rate ratios of 0.5 and 0.7. The shape parameter is 1.25. For Setting I, which describes the scenario in which the planning assumptions are correct, the sample size is chosen to be the fixed design sample size required for a power of 0.8. For Setting II, the sample size is to describe an overpowered clinical trial. The fixed design sample size is for , and for . The trend parameter does not affect the sample size in the fixed design as the fixed design is planned with an identical follow-up time for each subject.
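Choosing the intercept so that the cumulative control rate hits a prescribed value can be done in closed form when the rate is log-linear in time. The Python sketch below assumes a control rate of the form exp(beta0 + gamma*t); the function names are hypothetical, and the numeric inputs are the cumulative rate of 1.5 after two years and the three trend values from the simulation setup.

```python
import math


def cumulative_rate(t, beta0, gamma):
    """Integral of exp(beta0 + gamma*u) over u in [0, t] (assumed control rate)."""
    if abs(gamma) < 1e-12:
        return math.exp(beta0) * t
    return math.exp(beta0) * math.expm1(gamma * t) / gamma


def calibrate_intercept(target_cumulative, horizon, gamma):
    """Intercept beta0 such that the cumulative rate at `horizon` equals the target."""
    if abs(gamma) < 1e-12:
        return math.log(target_cumulative / horizon)
    # gamma and expm1(gamma * horizon) share the same sign, so the ratio is positive.
    return math.log(target_cumulative * gamma / math.expm1(gamma * horizon))


for gamma in (-0.25, -1.0, -1.5):
    beta0 = calibrate_intercept(target_cumulative=1.5, horizon=2.0, gamma=gamma)
    print(gamma, beta0, cumulative_rate(2.0, beta0, gamma))
```

Because every scenario is calibrated to the same two-year cumulative rate, a stronger negative trend shifts events toward the start of follow-up without changing the expected count at two years.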

Figure 1: Rate and cumulative rate for the control group for different time trends.

Parameter                                          Value
One-sided significance level                       0.025
Target power                                       0.8
Maximum individual follow-up [years]               2
Recruitment period [years]                         2
Study duration [years]                             4
Start time of information monitoring               After 0.5 years
Rate ratio under the null hypothesis               1
Rate ratio under the alternative hypothesis        0.5, 0.7
Cumulative rate in control group after two years   1.5
Time trend                                         -0.25, -1, -1.5
Shape parameter                                    1.25
Sample size allocation
Setting I: Planned total sample size
Setting II: Planned total sample size

Table 2: Scenarios considered in the simulation study motivated by the clinical trial example from Section 3.

In addition to the blinded continuous information monitoring procedure proposed in Section 4, we include the blinded continuous information monitoring procedure of Friede et al. (2018) in our simulation study to assess its robustness concerning time trends in the rate and to compare its performance to the procedure proposed in Section 4. The monitoring procedure by Friede et al. (2018) will be summarized in Section 5.2.

5.2 Monitoring procedure by Friede et al.

Friede et al. (2018) proposed an information monitoring procedure for the negative binomial model with a constant rate $\lambda_i$, $i = 1, 2$, and dispersion parameter $\phi$, where the hypotheses of interest are also defined through the rate ratio, that is

$$H_0: \frac{\lambda_2}{\lambda_1} \geq 1 \quad \text{versus} \quad H_1: \frac{\lambda_2}{\lambda_1} < 1.$$

Here, we utilize a maximum likelihood test based on the differences of log-rates to test the null hypothesis $H_0$. Then, the information at calendar time $c$ is given by

$$\mathcal{I}_c = \left[ \sum_{i=1}^{2} \left( \sum_{j=1}^{n_i} \frac{\lambda_i\, t_{ij}(c)}{1 + \phi\, \lambda_i\, t_{ij}(c)} \right)^{-1} \right]^{-1}.$$

Analogously to the information monitoring procedure proposed in Section 4, to monitor the information while maintaining the blinding, the parameters $\lambda_1$, $\lambda_2$, and $\phi$ are estimated from blinded data using either the lumping approach or the mixture approach. For details on the blinded parameter estimation, we refer to Section 4.1 in Friede et al. (2018). Denote the blinded estimators for the rate parameters $\lambda_1$, $\lambda_2$, and the dispersion parameter $\phi$ by $\hat\lambda_1$, $\hat\lambda_2$, and $\hat\phi$, respectively. Based on the blinded parameter estimators, the information is estimated through plug-in estimation by plugging in the following estimators:

We refer to this procedure by BCM-Const–Mix when the blinded parameter estimation utilizes the mixture approach and BCM-Const–Lump when the blinded parameter estimation utilizes the lumping approach. These procedures test the null hypothesis using the standard negative binomial model, that is the model without time trend. This is in contrast to the BCM-Trend procedures which perform the analysis based on (5) under the assumption of a time trend.
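As a rough illustration of blinded information monitoring under constant rates, the Python sketch below computes an information estimate from pooled exposure times. It is a sketch under stated assumptions, not necessarily the exact estimator of Friede et al. (2018): it uses the standard negative binomial information for a log rate, the sum of rate*t / (1 + phi*rate*t) over subjects, splits the blinded overall rate into group rates under the planning rate ratio, and weights the pooled sums by the allocation proportions. All names and input values are hypothetical.

```python
import math


def blinded_information(exposures, overall_rate, phi, beta_star, k_treat=0.5):
    """Information for the log rate ratio, estimated from blinded exposure times."""
    k_control = 1.0 - k_treat
    # Split the overall rate under the assumed effect: overall = k1*r1 + k2*r1*exp(beta*).
    rate_control = overall_rate / (k_control + k_treat * math.exp(beta_star))
    rate_treat = rate_control * math.exp(beta_star)

    def group_information(rate, weight):
        # Pooled exposure times stand in for the group's own, scaled by its share.
        return weight * sum(rate * t / (1.0 + phi * rate * t) for t in exposures)

    info_control = group_information(rate_control, k_control)
    info_treat = group_information(rate_treat, k_treat)
    return 1.0 / (1.0 / info_control + 1.0 / info_treat)


# Hypothetical blinded look: 100 subjects, each with two years of exposure.
info = blinded_information([2.0] * 100, overall_rate=0.5625, phi=1.25, beta_star=math.log(0.5))
print(info)
```

A monitoring rule would compare this estimate against the target information at each look and stop the trial the first time the estimate reaches the target.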

5.3 Operating characteristics for Setting I

In the following, we present the results of the simulation study for Setting I, that is the setting in which the planned sample size is equal to the sample size required in the fixed design to achieve the target power of 0.8. The simulation results presented in this section are based on Monte Carlo replications.

$\exp(\beta^*)$   Time trend   Fixed    Const–Lump   Const–Mix   Trend–Lump   Trend–Mix
0.5               -0.25        0.0272   0.0276       0.0269      0.0273       0.0267
0.5               -1           0.0286   0.0285       0.0292      0.0277       0.0283
0.5               -1.5         0.0278   0.0287       0.0286      0.0279       0.0277
0.7               -0.25        0.0250   0.0260       0.0254      0.0259       0.0252
0.7               -1           0.0259   0.0267       0.0274      0.0264       0.0263
0.7               -1.5         0.0256   0.0260       0.0260      0.0255       0.0253

Table 3: Simulated type I error rate for parameters from Table 2 for the statistical test in the fixed design and the monitoring procedures. Here, $\beta^*$ denotes the effect used for information monitoring. The Monte Carlo error is for a simulated type I error rate of .

Table 3 lists the simulated type I error rates for the parameters from Table 2. The simulated type I error rates of the monitoring procedures generally deviate by less than two times the Monte Carlo error from the simulated type I error rate of the fixed sample design. For the scenarios with the larger effect (0.5 in Table 3), which correspond to the scenarios with the smaller sample size, the type I error rate is inflated for the fixed sample design, and so are the type I error rates of the monitoring procedures. This inflation is due to the finite sample properties of the Wald test. In comparison, for the scenarios with the larger sample size (effect 0.7 in Table 3), the type I error rate is closer to the nominal level for all procedures. The time trend has no noticeable effect on the type I error rates. The simulated type I error rates of the BCM-Const procedures are larger than those of the BCM-Trend procedures. However, the difference is not of practical relevance. Whether this difference is due to the different stopping times (see Table S1 in the supplementary material) or due to BCM-Const not explicitly accounting for the time trend is not evident from this simulation study. Table S2 in the supplementary material shows that, across all scenarios and methods, the continuous information monitoring does not introduce any noticeable bias in the effect estimates at the end of the trial.
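The comparison against "two times the Monte Carlo error" can be sketched as follows. This is a minimal illustration that assumes the usual binomial standard error for a simulated rejection probability; the replication count of 10000 used in the usage note is a placeholder chosen for the example and is not the number of replications of the simulation study.

```python
from math import sqrt


def monte_carlo_error(probability: float, replications: int) -> float:
    """Binomial standard error of a simulated rejection probability."""
    return sqrt(probability * (1.0 - probability) / replications)


def within_two_mc_errors(rate_a: float, rate_b: float,
                         reference_probability: float,
                         replications: int) -> bool:
    """Check whether two simulated rates differ by less than twice the
    Monte Carlo error evaluated at a reference probability."""
    tolerance = 2.0 * monte_carlo_error(reference_probability, replications)
    return abs(rate_a - rate_b) < tolerance
```

For example, `within_two_mc_errors(0.0276, 0.0272, 0.025, 10000)` compares the Const–Lump and fixed-design entries of the first row of Table 3 under the placeholder replication count.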

Effect  Time trend  Fixed   Const–Lump  Const–Mix  Trend–Lump  Trend–Mix
0.5     -0.25       0.8095  0.7936      0.7784     0.7923      0.7764
0.5     -1          0.8134  0.7996      0.7867     0.7958      0.7809
0.5     -1.5        0.8086  0.7970      0.7850     0.7913      0.7749
0.7     -0.25       0.8026  0.7916      0.7873     0.7913      0.7867
0.7     -1          0.8040  0.7948      0.7908     0.7910      0.7862
0.7     -1.5        0.8056  0.7989      0.7952     0.7932      0.7885
Table 4: Simulated power for parameters from Table 2 for the statistical test in the fixed design and the designs with blinded continuous information monitoring procedure. The Monte Carlo error is for a simulated power of .

Table 4 shows that the blinded information monitoring procedures miss the target power by up to two percentage points. The monitoring procedures based on the lumping approach perform better than the procedures based on the mixture approach. This is due to the overestimation of the dispersion parameter in the lumping approach, which results in later stopping times; see Table S4 in the supplementary material. Comparing the procedures BCM-Const and BCM-Trend for a given method of blinded parameter estimation (lumping or mixture approach), the time trends do not affect the power noticeably: the differences in power in Table 4 are at most about one percentage point. The differences in power are associated with differences in the mean stopping time: a smaller power corresponds to an earlier mean stopping time, and monitoring procedures with a similar power have similar mean stopping times. The average stopping time of procedures based on the lumping approach is around 3.5 years, which can be up to four months later than for procedures based on the mixture approach.

In conclusion, when the planned sample size is identical to the sample size required in the fixed design to achieve the target power, the monitoring procedures based on the lumping approach have a power within one percentage point of the target power and can result in stopping the trial on average around six months earlier. The monitoring procedures do not result in a reduction of the sample size. For designs with continuous monitoring, the mean stopping time under the null hypothesis is smaller than the mean stopping time under the alternative hypothesis. This is due to the larger overall event rate under the null hypothesis.

5.4 Operating characteristics for Setting II

In the following, we study the operating characteristics for the setting in which the planned sample size is larger than what would be required in the fixed sample design to achieve the target power. As before, the results are based on Monte Carlo replications. Table 5 lists the simulated type I error rates.

Effect  Time trend  Fixed   Const–Lump  Const–Mix  Trend–Lump  Trend–Mix
0.5     -0.25       0.0281  0.0281      0.0277     0.0276      0.0276
0.5     -1          0.0275  0.0289      0.0287     0.0278      0.0282
0.5     -1.5        0.0252  0.0279      0.0287     0.0272      0.0272
0.7     -0.25       0.0252  0.0256      0.0255     0.0250      0.0251
0.7     -1          0.0248  0.0263      0.0264     0.0249      0.0256
0.7     -1.5        0.0253  0.0276      0.0275     0.0260      0.0263
Table 5: Simulated type I error rate for parameters from Table 2 for the statistical test in the fixed design and the monitoring procedures. Here, denotes the effect used for information monitoring. The Monte Carlo error is for a simulated type I error rate of .

For the scenarios with the effect 0.5, the simulated type I error rates in Table 5 are on average about 0.027 in the fixed sample design and about 0.028 in the designs with information monitoring. The small type I error rate inflation in these scenarios can be explained by the small sample size. In particular, the final sample size in the designs with information monitoring is on average smaller than in the fixed sample design; the size of the reduction depends on the effect size, monitoring procedure, and time trend. For the scenarios with the effect 0.7, the statistical test in the fixed sample design controls the type I error rate. The simulated type I error rates for the BCM-Trend procedures are within two times the Monte Carlo error of the simulated type I error rates in the fixed sample design. For BCM-Const, this does not hold for all considered time trends: the simulated type I error rate for BCM-Const increases with the strength of the time trend, resulting in a noticeable type I error rate inflation. The magnitude of this inflation is small.

Table 6 shows the simulated power.

Effect  Time trend  Fixed   Const–Lump  Const–Mix  Trend–Lump  Trend–Mix
0.5     -0.25       0.9333  0.8259      0.8010     0.8222      0.7979
0.5     -1          0.9361  0.8389      0.8150     0.8260      0.8006
0.5     -1.5        0.9348  0.8420      0.8159     0.8244      0.7964
0.7     -0.25       0.9300  0.8120      0.8048     0.8082      0.8003
0.7     -1          0.9295  0.8243      0.8169     0.8097      0.8007
0.7     -1.5        0.9307  0.8306      0.8228     0.8080      0.8000
Table 6: Simulated power for parameters from Table 2 for the statistical test in the fixed design and the monitoring procedures. The Monte Carlo error is for a simulated power of .

Table 6 shows that the fixed design is overpowered, as anticipated when the planned sample size exceeds the required sample size. The information monitoring procedures counteract this. However, the BCM-Const monitoring procedures do not fully mitigate the overpowering and yield a power that is up to four percentage points larger than the target power. Moreover, the overpowering increases as the time trend increases for the BCM-Const procedures. Applying the mixture approach for the blinded parameter estimation in the information monitoring results in a power closer to the target compared to applying the lumping approach. The monitoring procedures BCM-Trend, which explicitly account for a time trend, perform better for the scenarios presented in Table 6. In particular, BCM-Trend–Mix achieves the target power for the considered effect sizes and time trends. BCM-Trend–Lump is overpowered for the larger effect size. The mean stopping time is about two years or less; see Table S10 in the supplementary material. Thus, the monitoring procedures result in shorter clinical trials and also reduce the sample size compared to the fixed sample design.

In summary, information monitoring can prevent overpowering of clinical trials with recurrent events and time-dependent rates. The information monitoring procedure BCM-Const proposed by Friede et al. (2018) can result in overpowered and longer-running clinical trials when the event rates are time-dependent. For the considered scenarios, the monitoring procedures mitigate the overpowering due to a too large sample size by shortening the trial duration and, as such, the subjects' follow-up times.

6 Motivating example revisited

In this section, we revisit the clinical trial in pediatric multiple sclerosis introduced in Section 3. We illustrate the four methods for blinded continuous information monitoring discussed in this manuscript using data from the motivating example (Chitnis et al., 2018). To this end, for each calendar time during the course of the trial, we estimate the information based on the data available at that calendar time. For this blinded information estimation, we assume a treatment effect of 0.5, which corresponds to the planning alternative of the actual trial, and a treatment effect of 0.2, which corresponds to the treatment effect eventually observed (Chitnis et al., 2018). Table 7 lists the information level required for the considered effects when a power of 80% or 90% is targeted.

Treatment effect  Target power  Target information
0.2               80%           3.03
0.2               90%           4.06
0.5               80%           16.34
0.5               90%           21.87
Table 7: Target information for a one-sided significance level of 0.025.
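The target information values in Table 7 agree with the standard information requirement for a one-sided Wald test of a log rate ratio, namely the squared sum of the normal quantiles for the significance level and the power, divided by the squared log rate ratio. The sketch below assumes a one-sided level of 0.025 and target powers of 0.8 and 0.9, which reproduce the tabulated values; the function name is chosen for this illustration.

```python
from math import log
from statistics import NormalDist


def target_information(rate_ratio: float, power: float,
                       alpha: float = 0.025) -> float:
    """Information needed by a one-sided level-alpha Wald test of the
    log rate ratio to reach the given power at the assumed rate ratio."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha)
    z_power = standard_normal.inv_cdf(power)
    return ((z_alpha + z_power) / log(rate_ratio)) ** 2


for effect in (0.2, 0.5):
    for power in (0.8, 0.9):
        print(effect, power, round(target_information(effect, power), 2))
```

Because the target information depends on the effect only through the squared log rate ratio, moving from an assumed rate ratio of 0.5 to 0.2 reduces the required information by a factor of more than five.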

Figure 2 plots the blinded information estimated for four monitoring procedures versus the calendar time for data from the clinical trial in pediatric multiple sclerosis published by Chitnis et al. (2018).

Figure 2: Blinded information estimates versus the calendar time for data from a clinical trial in pediatric multiple sclerosis under assumed treatment effects of 0.2 and 0.5.

Figure 2 shows that for an assumed effect of 0.2 the information estimates from the monitoring procedures based on the lumping approach differ substantially from the information estimates of the procedures using the mixture approach, in particular for dates close to the end of the trial. Moreover, whether a monitoring procedure accounts for the time trend explicitly has no practical relevance; in other words, the procedures BCM-Trend and BCM-Const are almost identical for the lumping and the mixture approach, respectively. Since the target information for an effect of 0.2 is between 3 and 4, depending on the target power, all four monitoring procedures would have stopped within a couple of months of each other, almost two years before the actual end of the trial.

For an effect of 0.5, the monitoring procedures using the mixture approach result in larger information estimates than the procedures based on the lumping approach, analogous to an assumed effect of 0.2. The difference between the monitoring procedure with the largest information estimate, BCM-Trend–Mix, and the procedure with the smallest information estimate, BCM-Const–Lump, is at most one. For the majority of the trial, this difference in the estimates has no relevant impact on the potential stopping time, as the information increases by one every one to two months. However, towards the end of the trial, that is, after October 2016 in the considered trial, the information curves become flat, such that a difference of one can have a substantial effect of half a year or more. None of the methods considered in this section reaches the required target information for this effect. It is important to emphasize that during the conduct of the clinical trial in pediatric multiple sclerosis, the information was achieved. For the information monitoring during the clinical trial, covariates were included. Since including covariates reduces the variability, the information is larger when covariates are included in the blinded parameter estimation. Moreover, the planning assumption turned out to be conservative, and the information required to achieve the target power under the observed effect was reached more than 1.5 years before the end of the trial.
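The stopping rule implied by this comparison, namely stopping at the first calendar time at which the blinded information estimate reaches the target information, can be sketched as follows. The dates and information values in the usage note are made up for illustration and are not data from the trial.

```python
from datetime import date
from typing import Optional, Sequence, Tuple


def first_crossing(estimates: Sequence[Tuple[date, float]],
                   target: float) -> Optional[date]:
    """Return the earliest calendar date at which the estimated
    information reaches the target, or None if it never does."""
    for when, estimated_information in sorted(estimates):
        if estimated_information >= target:
            return when
    return None
```

With hypothetical estimates `[(date(2015, 1, 1), 2.1), (date(2015, 3, 1), 3.2), (date(2015, 5, 1), 4.3)]`, a target of 3.03 is first reached on 2015-03-01 and a target of 4.06 on 2015-05-01, while a target of 16.34 is never reached and the function returns `None`.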

7 Discussion

Although the blinded monitoring procedure BCM-Const assuming constant rates previously proposed by Friede et al. (2018) turned out to be robust to some degree to time dependencies of the event rates, the procedures accounting for time trends were found to have more favorable operating characteristics. The proposed monitoring procedures generally did not inflate the type I error rate beyond levels observed in the fixed sample designs for the asymptotic Wald test in any practically relevant way and did not bias the treatment effect estimates in the final analysis. Both properties are important from a regulatory point of view. If the planning assumptions are correct (Setting I), application of the information monitoring can still lead to considerable time savings. When the planning assumptions are too conservative (Setting II), the savings in terms of time and sample size are more pronounced. The differences between the lumping and the mixture approaches are generally small, with a tendency for the lumping approach to stop trials later, resulting in higher power.

Here we used the blinded continuous information monitoring to stop a trial early if the information level specified in the planning had been reached. This could be due to higher event rates or less pronounced overdispersion than assumed, a scenario encountered in the trial in pediatric MS by Chitnis et al. (2018). If the blinded assessment of the nuisance parameters reveals that the planning assumptions were too optimistic, then the initially planned sample size or maximum follow-up time might be increased. We did not explore this here as it was not relevant to our motivating example, but the combination of blinded continuous monitoring with sample size re-estimation or group-sequential designs is of practical interest and will be explored by our group in the future.

The analyses of randomized controlled trials are often adjusted for stratification factors of the randomization or important prognostic variables. Although the procedures presented here could in principle be expanded to regression models adjusted for covariates, we did not investigate this any further. However, sample size re-estimation for covariate-adjusted analyses with overdispersed count data has recently been considered by Zapf et al. (2019). As the motivating example shows, the inclusion of covariates in the analysis can lead to earlier completion of a trial. Moreover, the procedures presented here also apply to time trends other than the log-linear trend introduced in (1). For instance, one could apply periodic functions modeling seasonal trends, which are common in asthma and chronic obstructive pulmonary disease (COPD).
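To make the idea of a periodic time trend concrete, the following sketch evaluates a hypothetical seasonal event rate whose logarithm is a sine and cosine wave with a period of one year, and integrates it numerically to obtain the expected number of events over a follow-up interval. The functional form, parameter names, and the trapezoidal integration are assumptions made for this illustration; they are not the model used in the paper.

```python
from math import cos, exp, pi, sin


def seasonal_rate(t: float, log_base_rate: float, amp_sin: float,
                  amp_cos: float, period: float = 1.0) -> float:
    """Hypothetical event rate at time t with a periodic log-rate."""
    angle = 2.0 * pi * t / period
    return exp(log_base_rate + amp_sin * sin(angle) + amp_cos * cos(angle))


def expected_events(start: float, stop: float, steps: int = 1000,
                    **rate_parameters: float) -> float:
    """Expected number of events between start and stop, obtained by
    integrating the rate with the composite trapezoidal rule."""
    width = (stop - start) / steps
    total = 0.5 * (seasonal_rate(start, **rate_parameters)
                   + seasonal_rate(stop, **rate_parameters))
    total += sum(seasonal_rate(start + k * width, **rate_parameters)
                 for k in range(1, steps))
    return total * width
```

With both amplitudes set to zero the rate is constant, so `expected_events(0.0, 1.5, log_base_rate=0.0, amp_sin=0.0, amp_cos=0.0)` reduces to the follow-up time multiplied by the rate.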

Acknowledgments

The authors wish to thank Nikolaos Sfikas for helpful comments.

Author contributions

TM, SS, and TF conceived the concept of this study. TM conducted all numerical evaluations for the examples and drafted the manuscript. TM and SS conducted the numerical evaluations for the simulations. TM, SS, NB, HS, and TF critically reviewed and made substantial contributions to the manuscript. All authors commented on and approved the final manuscript.

Conflict of interest

TM and HS are employed by Novartis Pharma AG and own stock in the company. TF provided consultancies to Novartis Pharma AG regarding sample size re-estimation strategies for the pediatric MS study that served as an example in this paper.

References

  • Asendorf et al. (2017) Asendorf, T., Henderson, R., Schmidli, H. and Friede, T. (2017) Modelling and sample size reestimation for longitudinal count data with incomplete follow up. Statistical Methods in Medical Research, 0, 1–17. URL: https://doi.org/10.1177/0962280217715664.
  • Asendorf et al. (2018) Asendorf, T., Henderson, R., Schmidli, H. and Friede, T. (2018) Sample size re-estimation for clinical trials with longitudinal negative binomial counts including time trends. Statistics in Medicine.
  • Chitnis et al. (2018) Chitnis, T., Arnold, D. L., Banwell, B. et al. (2018) Trial of fingolimod versus interferon beta-1a in pediatric multiple sclerosis. New England Journal of Medicine, 379, 1017–1027.
  • Cook (2013) Cook, R. J. (2013) Authors' redress on 'Robustness of methods for blinded sample size re-estimation with overdispersed count data'. Statistics in Medicine, 32, 3955–3957.
  • Cook et al. (2009) Cook, R. J., Bergeron, P.-J., Boher, J.-M. and Liu, Y. (2009) Two-stage design of clinical trials involving recurrent events. Statistics in Medicine, 28, 2617–2638.
  • European Medicines Agency (EMA) (2007) European Medicines Agency (EMA) (2007) Reflection paper on methodological issues in confirmatory clinical trials planned with an adaptive design. URL: https://www.ema.europa.eu/en/methodological-issues-confirmatory-clinical-trials-planned-adaptive-design. Accessed 2018-12-02.
  • Food and Drug Administration (FDA) (2018) Food and Drug Administration (FDA) (2018) Adaptive designs for clinical trials of drugs and biologics - guidance for industry. URL: https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM201790.pdf. Accessed 2018-12-02.
  • Friede et al. (2018) Friede, T., Häring, D. A. and Schmidli, H. (2018) Blinded continuous monitoring in clinical trials with recurrent event endpoints. Pharmaceutical Statistics.
  • Friede and Kieser (2002) Friede, T. and Kieser, M. (2002) On the inappropriateness of an EM algorithm based procedure for blinded sample size re-estimation. Statistics in Medicine, 21, 165–176.
  • Friede and Miller (2012) Friede, T. and Miller, F. (2012) Blinded continuous monitoring of nuisance parameters in clinical trials. Journal of the Royal Statistical Society: Series C (Applied Statistics), 61, 601–618.
  • Friede and Schmidli (2010a) Friede, T. and Schmidli, H. (2010a) Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine, 29, 1145–1156.
  • Friede and Schmidli (2010b) Friede, T. and Schmidli, H. (2010b) Blinded sample size reestimation with negative binomial counts in superiority and non-inferiority trials. Methods of Information in Medicine, 49, 618–624.
  • Jennison and Turnbull (2000) Jennison, C. and Turnbull, B. W. (2000) Group sequential methods with applications to clinical trials. Boca Raton, FL: Chapman-Hall/CRC.
  • Lawless (1987a) Lawless, J. (1987a) Negative binomial and mixed Poisson regression. Canadian Journal of Statistics, 15, 209–225.
  • Lawless (1987b) Lawless, J. (1987b) Regression methods for Poisson process data. Journal of the American Statistical Association, 82, 808–815.
  • Nicholas et al. (2012) Nicholas, R., Straube, S., Schmidli, H., Pfeiffer, S. and Friede, T. (2012) Time-patterns of annualized relapse rates in randomized placebo-controlled clinical trials in relapsing multiple sclerosis: A systematic review and meta-analysis. Multiple Sclerosis Journal, 18, 1290–1296.
  • Schneider et al. (2013a) Schneider, S., Schmidli, H. and Friede, T. (2013a) Blinded sample size re-estimation for recurrent event data with time trends. Statistics in Medicine, 32, 5448–5457.
  • Schneider et al. (2013b) Schneider, S., Schmidli, H. and Friede, T. (2013b) Robustness of methods for blinded sample size re-estimation with overdispersed count data. Statistics in Medicine, 32, 3623–3635.
  • Tsiatis (2006) Tsiatis, A. A. (2006) Information-based monitoring of clinical trials. Statistics in Medicine, 25, 3236–3244.
  • Waksman (2007) Waksman, J. A. (2007) Assessment of the Gould-Shih procedure for sample size re-estimation. Pharmaceutical Statistics, 6, 53–65.
  • Zapf et al. (2019) Zapf, A., Asendorf, T., Anten, C. et al. (2019) Blinded sample size reestimation for negative binomial regression with baseline adjustment. In preparation.

Appendix A Maximum likelihood theory

For the sake of readability, we omit the index . Therefore, we denote the exposure time of subject receiving in group at calendar time by instead of , and we denote the number of events up to time by instead of .

Here, is the digamma function and is the first derivative of the digamma function. Next, we calculate the second partial derivatives. We start with the second partial derivatives with respect to .

With , it follows that all second partial derivatives with respect to are zero. We continue with the remaining second partial derivatives. We define .

It follows that the Fisher information matrix for the parameter vector is given by

with