Model-Robust Inference for Clinical Trials that Improve Precision by Stratified Randomization and Adjustment for Additional Baseline Variables

by   Bingkai Wang, et al.

We focus on estimating the average treatment effect in clinical trials that involve stratified randomization, which is commonly used. It is important to understand the large sample properties of estimators that adjust for stratum variables (those used in the randomization procedure) and additional baseline variables, since this can lead to substantial gains in precision and power. Surprisingly, to the best of our knowledge, this is an open problem. It was only recently that a simpler problem was solved by Bugni et al. (2018) for the case with no additional baseline variables, continuous outcomes, the analysis of covariance (ANCOVA) estimator, and no missing data. We generalize their results in three directions. First, in addition to continuous outcomes, we handle binary and time-to-event outcomes; this broadens the applicability of the results. Second, we allow adjustment for an additional, preplanned set of baseline variables, which can improve precision. Third, we handle missing outcomes under the missing at random assumption. We prove that a wide class of estimators is asymptotically normally distributed under stratified randomization and has equal or smaller asymptotic variance than under simple randomization. For each estimator in this class, we give a consistent variance estimator. This is important in order to fully capitalize on the combined precision gains from stratified randomization and adjustment for additional baseline variables. The above results also hold for the biased-coin covariate-adaptive design. We demonstrate our results using completed trial data sets of treatments for substance use disorder, where adjustment for additional baseline variables brings substantial variance reduction.






1 Introduction

Covariate-adaptive designs refer to randomization schemes that assign participants to study arms in a way that improves balance across arms in preselected strata of the baseline variables. E.g., balance on disease severity, a genetic marker, or another variable thought to be correlated with the primary outcome could be sought. The simplest form of covariate-adaptive randomization is stratified permuted block randomization (referred to as “stratified randomization” throughout, for conciseness).

Compared with simple randomization, covariate-adaptive randomization can be advantageous in minimizing imbalance and improving efficiency (Efron, 1971; Pocock and Simon, 1975; Wei, 1978). Due to these benefits, covariate-adaptive randomization has become a popular approach in clinical trials. According to a survey by Lin et al. (2015), 183 out of their sample of 224 randomized clinical trials published in 2014 in leading medical journals used some form of covariate-adaptive randomization. Stratified randomization (Zelen, 1974) was implemented by 70% of trials in this survey. Another method for covariate-adaptive randomization is the biased-coin design by Efron (1971). Other examples include Wei’s urn design (Wei, 1978) and rerandomization (Morgan and Rubin, 2012). Our results only apply to stratified randomization and the biased-coin design.

Concerns have been raised regarding how to perform valid statistical analyses at the end of trials that use covariate-adaptive randomization. Adjusting for stratification variables is recommended (Lachin et al., 1988; Kahan and Morris, 2012; EMA, 2015). However, in a survey by Kahan and Morris (2012) that sampled 65 published trials from major medical journals from March to May 2010, 41 implemented covariate-adaptive randomization (among which 29 used stratified randomization), but only 14 adjusted in the primary analysis for the variables used in the randomization procedure. Furthermore, there are few results on how to conduct the primary efficacy analysis in trials that use stratified randomization without making parametric model assumptions (which, if incorrect, can lead to bias).

Shao et al. (2010) proved the validity of the two-sample t-test under the biased-coin design assuming the outcome follows a linear model. Shao and Yu (2013) extended this result to a case where the outcome follows a generalized linear model. Ma et al. (2015, 2018) also assumed a linear model and derived the asymptotic distribution of the test statistic of the average treatment effect for the ANCOVA estimator and a class of covariate-adaptive designs.

Bugni et al. (2018), who use the superpopulation inference framework as done here, established the asymptotic theory of the unadjusted estimator and the ANCOVA estimator (with adjustment for strata only) of the average treatment effect for a wide range of covariate-adaptive designs; their results, like ours in Sections 5.1-5.2, are robust to arbitrary misspecification of the regression model used in the estimator. Ye and Shao (2019) derived asymptotics for log-rank and score tests in survival analysis under covariate-adaptive randomization, and their methods can handle adjustment for additional baseline variables; however, estimation was not addressed. Li and Ding (2019) established the asymptotic theory for the ANCOVA estimator under covariate-adaptive randomization in the randomization inference framework.

For trials using stratified randomization or biased-coin covariate-adaptive randomization, to the best of our knowledge, it was an open problem to determine (without making parametric regression model assumptions) the large sample properties of covariate-adjusted estimators that involve any of the following: binary outcomes or time-to-event outcomes, adjustment for baseline variables other than those in the randomization procedure, or missing data. This is the problem that we address. The main challenge is that treatment assignment is not independent across participants.

Under regularity conditions, we prove that a large class of estimators is asymptotically normally distributed in randomized trials that use stratified randomization or the biased-coin design, and we give a formula for computing their asymptotic variance. This class of estimators includes ANCOVA (referred to as “ANCOVA I” by Yang and Tsiatis, 2001) for continuous outcomes, the standardized logistic regression estimator (Scharfstein et al., 1999; Moore and van der Laan, 2009) for binary outcomes, the doubly-robust weighted least squares (DR-WLS) estimator from Marshall Joffe as described by Robins et al. (2007) for continuous and binary outcomes, and estimators of the restricted mean survival time for time-to-event outcomes (Van der Laan et al., 2003; Díaz et al., 2019). The key theoretical underpinning of our results is the empirical process result from Shorack and Wellner (2009), which was insightfully used by Bugni et al. (2018) to prove their fundamental results for the special case described above.

As in the above referenced work, we assume that the randomization scheme and analysis method have been completely specified before the trial starts, as is typically required by regulators such as the European Medicines Agency and the U.S. Food and Drug Administration (FDA and EMA, 1998; EMA, 2015; FDA, 2019).

In the next section, we describe three motivating trial examples to which we apply our methods. In Section 3, we describe our setup, notation and assumptions. We present our main results in Section 4. In Section 5, we give example estimators to which our general results apply. Trial applications are provided in Section 6. Practical recommendations and future directions are discussed in Section 7.

2 Three completed randomized trials that used stratified randomization

2.1 Study of buprenorphine tapering schedule and illicit opioid use (NIDA-CTN-0003)

The trial of “Buprenorphine tapering schedule and illicit opioid use”, referred to as “NIDA-CTN-0003” in the National Drug Abuse Treatment Clinical Trials Network, is a phase-3 randomized trial completed in 2005 (Ling et al., 2009). The goal was to compare the effects of a short or long taper schedule after buprenorphine stabilization of patients with opioid use disorder. Patients were randomized into two arms: 28-day taper (control, 259 patients, 36% missing outcomes) and 7-day taper (treatment, 252 patients, 21% missing outcomes), stratified by maintenance dose (3 levels) measured at randomization. The outcome of interest is a binary indicator of whether a participant’s urine tested at the end of study is opioid-free. Missing outcomes are encoded as “NA” and not imputed. In addition to the stratification variable, we adjust for the following baseline variables: sex, opioid urine toxicology results, the Adjective Rating Scale for Withdrawal (ARSW), the Clinical Opiate Withdrawal Scale (COWS) and the Visual Analog Scale (VAS).

2.2 Prescription Opioid Addiction Treatment Study (NIDA-CTN-0030)

The “Prescription Opioid Addiction Treatment Study”, referred to as “NIDA-CTN-0030”, is a phase-3 randomized trial completed in 2013 (Weiss et al., 2011). The goal was to determine whether adding individual drug counseling to the prescription of buprenorphine/naloxone would improve treatment outcomes for patients with prescription opioid use disorder. Though this study adopted a 2-phase adaptive design, we focus on phase I, where patients were randomized into standard medical management (control, 330 patients, 10% missing outcomes) or standard medical management plus drug counseling (treatment, 335 patients, 13% missing outcomes). Randomization was stratified by the presence or absence of a history of heroin use and current chronic pain, resulting in 4 strata. The outcome of interest is the proportion of negative urine laboratory results among all tests. Among all 5 urine laboratory tests during the first 4 weeks of phase I, if a patient missed visits of more than 2 weeks, the outcome is regarded as missing. Baseline variables that are included in the analysis are strata, age, sex and urine laboratory result.

2.3 Study of internet-delivered treatment for substance abuse (NIDA-CTN-0044)

The phase-3 randomized trial “Internet-delivered treatment for substance abuse”, referred to as “NIDA-CTN-0044”, was completed in 2012 (Campbell et al., 2014). The goal was to evaluate the effectiveness of a web-delivered behavioral intervention, Therapeutic Education System (TES), in the treatment of substance abuse. Participants were randomly assigned to two arms: treatment as usual (control, 252 participants, 19% missing outcomes) and treatment as usual plus TES (treatment, 255 participants, 18% missing outcomes). Randomization was stratified by treatment site, patient’s primary substance of abuse (stimulant or nonstimulant) and abstinence status at baseline. Since we do not have access to the treatment site variable, only patient’s primary substance of abuse and abstinence status at baseline were used to construct strata (4 levels) and treatment site was omitted from our analysis. After randomization, each participant was followed for 12 weeks with 2 urine laboratory tests per week. The outcome of interest is the proportion of negative urine lab results among all tests. If a participant missed visits of more than 6 weeks, the outcome is regarded as missing. We adjust for strata and additional baseline variables: age, sex and urine laboratory result.

In some cases, the outcomes in our analyses differ from the primary outcomes in the corresponding trials. The reason is that we wanted to use the same outcomes across trials for illustration. Our outcomes are all considered clinically meaningful in the field of substance use disorder treatments. All trial data were drawn from the National Institute on Drug Abuse Clinical Trials Network Data Share Website (CITE).

3 Definitions and assumptions

We focus on randomized trials that use stratified randomization (or the biased-coin design) with $n$ participants. For each participant $i = 1, \dots, n$, let $Y_i$ denote the observed outcome, $M_i$ denote whether $Y_i$ is observed ($M_i = 1$) or not ($M_i = 0$), $A_i$ denote the treatment allocation ($A_i = 1$ if assigned to treatment and $A_i = 0$ if assigned to control), and $X_i$ denote a vector of baseline covariates.

We adopt the Neyman-Rubin causal model, which assumes $Y_i = Y_i(1)A_i + Y_i(0)(1 - A_i)$, where $Y_i(a)$ is the potential outcome for participant $i$ were they in treatment group $a \in \{0, 1\}$. Analogous to the definition of potential outcomes, we define “potential missingness” $M_i(a)$, which is the indicator of whether $Y_i$ would be observed for participant $i$ were they assigned to study arm $a$; we assume that the observed (non)-missingness variable $M_i$ is connected to the potential missingness variables as follows: $M_i = M_i(1)A_i + M_i(0)(1 - A_i)$. Though potential missingness is not commonly used in the literature, it fits into the Neyman-Rubin causal model if missingness (e.g., whether a participant completes follow-up) is also an outcome of interest, and the outcome vector is defined as $(Y_i(a), M_i(a))$. We emphasize that our asymptotic results involve neither the joint distribution of $(Y_i(0), Y_i(1))$ nor $(M_i(0), M_i(1))$, and that the targets of estimation are identifiable using only the observed data distribution. Denote participant $i$’s vector of potential outcomes as $W_i = (Y_i(1), Y_i(0), M_i(1), M_i(0), X_i)$ (some of which are not observed) and their observed vector as $(A_i, X_i, Y_i M_i, M_i)$. We make the following assumptions on the distribution of potential outcome vectors $W_i$:

Assumption 1.

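(1) $W_i$, $i = 1, \dots, n$, are independent samples from an unknown joint distribution $P$ on $W = (Y(1), Y(0), M(1), M(0), X)$.

(2) Missing (censoring) at random (MAR): $M(a) \perp Y(a) \mid X$ for each arm $a \in \{0, 1\}$, where $\perp$ denotes independence.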

For stratified randomization or the biased-coin design, treatment allocation depends on strata, such as gender, location or race. Strata are represented by a single, categorical, stratification variable $S$ taking $K$ possible values, which is encoded using dummy variables in the covariate vector $X$. For example, if strata are determined by 4 sites and a binary indicator of high disease severity, then $S$ has $K = 8$ possible values. We denote $S_i$ as the stratification variable for participant $i$ and $\mathcal{S}$ the set of all strata. The goal of stratified randomization or the biased-coin design is to achieve “balance” in each stratum; that is, the proportion of participants assigned to the treatment arm is targeted to a prespecified proportion $\pi \in (0, 1)$, e.g. $\pi = 0.5$.

Stratified randomization uses permuted blocks to assign treatment. In each stratum, a randomly permuted block with fraction $\pi$ of 1’s (representing treatment) and fraction $1 - \pi$ of 0’s (representing control) is used for sequential allocation. When a block is exhausted, a new block is used.
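The allocation procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any of the trials; the function name and the `block_size` and `seed` parameters are assumptions introduced here, since the text does not specify a block size.

```python
import random

def stratified_block_randomize(strata, block_size=4, pi=0.5, seed=0):
    """Assign 1 (treatment) or 0 (control) with permuted blocks per stratum.

    `strata` lists each participant's stratum label in enrollment order.
    `block_size` and `seed` are illustrative choices, not from the paper.
    """
    rng = random.Random(seed)
    n_treat = round(block_size * pi)  # number of 1's in every block
    if not 0 < n_treat < block_size:
        raise ValueError("each block needs at least one 1 and one 0")
    open_blocks = {}  # stratum -> assignments left in its current block
    assignments = []
    for s in strata:
        block = open_blocks.get(s)
        if not block:  # no block yet, or the current block is exhausted
            block = [1] * n_treat + [0] * (block_size - n_treat)
            rng.shuffle(block)
            open_blocks[s] = block
        assignments.append(block.pop())
    return assignments
```

For example, `stratified_block_randomize(["low", "high", "low", "low"], block_size=2)` returns one 0/1 assignment per participant, and within each stratum every completed block contains exactly the targeted share of treatment assignments.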

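The biased-coin design allocates participants sequentially by the following rule for $\pi = 0.5$:

$$P(A_i = 1 \mid S_1, \dots, S_i, A_1, \dots, A_{i-1}) = \begin{cases} 0.5, & \text{if } D_i = 0, \\ \lambda, & \text{if } D_i < 0, \\ 1 - \lambda, & \text{if } D_i > 0, \end{cases} \qquad D_i = \sum_{j=1}^{i-1} (A_j - 0.5)\, 1\{S_j = S_i\},$$

with $\lambda \in (0.5, 1]$, e.g., $\lambda = 2/3$ as suggested by Efron (1971).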
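A sketch of Efron's biased-coin rule applied within strata is given below, assuming equal target allocation. The function name, the `lam` parameter (the probability of favouring the under-represented arm) and its default of 2/3 are assumptions made for illustration.

```python
import random

def biased_coin_randomize(strata, lam=2 / 3, seed=0):
    """Sequentially assign 1 (treatment) or 0 (control) within strata.

    `lam` in (0.5, 1] is the probability of assigning the arm that is
    currently under-represented in the participant's stratum.
    """
    rng = random.Random(seed)
    imbalance = {}  # stratum -> (# treated) - (# control) so far
    assignments = []
    for s in strata:
        d = imbalance.get(s, 0)
        if d == 0:
            p_treat = 0.5      # balanced so far: fair coin
        elif d < 0:
            p_treat = lam      # fewer treated: favour treatment
        else:
            p_treat = 1 - lam  # more treated: favour control
        a = 1 if rng.random() < p_treat else 0
        assignments.append(a)
        imbalance[s] = d + (1 if a == 1 else -1)
    return assignments
```

Setting `lam=1` makes the rule deterministic whenever a stratum is unbalanced, which forces the within-stratum imbalance to stay at most one; values closer to 0.5 behave more like simple randomization.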

For these two designs, it follows by construction (as proved by Examples 3.2 and 3.4 of Bugni et al., 2018) that since treatment allocation only depends on strata and exogenous randomness. This and the assumptions above imply for each that

Different from these two designs, simple randomization assigns treatments by independently flipping a coin for each participant with . This randomization scheme implies that for and for all .

Given the observed data $(A_i, X_i, Y_i M_i, M_i)$ for each participant $i = 1, \dots, n$, our goal is to estimate a population parameter $\Delta^*$, which is a contrast between the marginal distributions of $Y(1)$ and $Y(0)$. For example, the parameter of interest can be the population average treatment effect, defined as $\Delta^* = E[Y(1)] - E[Y(0)]$, or it can be the restricted mean survival time when the outcome is a time-to-event.

We assume that the estimator of $\Delta^*$ can be expressed as a solution to estimating equations of the form:

$$\sum_{i=1}^{n} \psi(A_i, X_i, Y_i, M_i; \theta) = 0, \qquad (1)$$

where $\psi$ is a column vector of known functions with dimension $p + 1$, $\theta = (\Delta, \beta^t)^t$ is a column vector of parameters with dimension $p + 1$ and $\beta$ is a column vector of nuisance parameters with dimension $p$. We denote the solution to equation (1) as $\hat\theta = (\hat\Delta, \hat\beta^t)^t$. Then $\hat\Delta$ is regarded as the estimator of $\Delta^*$. Many estimators used in clinical trials, including all estimators defined in Section 5, can be expressed as solutions to estimating equations (1) for an appropriately chosen estimating function $\psi$.
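As a concrete instance of this framework, the unadjusted difference-in-means estimator can be written as the solution to a pair of estimating equations. The following is a sketch for the case of no missing data; the symbols $\beta_0$, $\bar{Y}_1$ and $\bar{Y}_0$ (an intercept and the arm-specific sample means) are introduced here for illustration.

```latex
\[
\psi(A, X, Y, M; \theta) =
\begin{pmatrix} A\,(Y - \beta_0 - \Delta A) \\ Y - \beta_0 - \Delta A \end{pmatrix},
\qquad \theta = (\Delta, \beta_0)^t .
\]
% First component summed over i:  \sum_{i: A_i = 1} (Y_i - \beta_0 - \Delta) = 0,
%   so \beta_0 + \Delta = \bar{Y}_1.
% Second minus first component:   \sum_{i: A_i = 0} (Y_i - \beta_0) = 0,
%   so \beta_0 = \bar{Y}_0.
\[
\hat\beta_0 = \bar{Y}_0, \qquad \hat\Delta = \bar{Y}_1 - \bar{Y}_0 .
\]
```

Here the nuisance parameter is the control-arm mean, and the first component of the solution is the estimator of the average treatment effect.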

We assume regularity conditions similar to the classical conditions that are used for proving consistency and asymptotic linearity of M-estimators for independent, identically distributed (i.i.d.) data, as given in Section 5.3 of van der Vaart (1998). One of the conditions is that the expectation of the estimating equations has a unique solution in $\theta$; in our framework, the expectation on the observed data vector $(A, X, YM, M)$ is defined as being with respect to the distribution induced by first drawing $W$ from the joint distribution $P$ on potential outcomes (see Assumption 1), then drawing $A$ as an independent Bernoulli draw with probability $\pi$ of being $1$, and lastly applying the consistency assumptions $Y = Y(1)A + Y(0)(1 - A)$ and $M = M(1)A + M(0)(1 - A)$. Equivalently, this condition is that

$$E\left[\pi\, \psi(1, X, Y(1), M(1); \theta) + (1 - \pi)\, \psi(0, X, Y(0), M(0); \theta)\right] = 0 \qquad (2)$$

has a unique solution in $\theta$, which is denoted as $\underline{\theta} = (\underline{\Delta}, \underline{\beta}^t)^t$, where $E$ denotes expectation with respect to the joint distribution $P$ on the potential outcomes. The full set of regularity conditions is given in Appendix A.

Results in Section 5.3 of van der Vaart (1998) imply that under simple randomization, given Assumption 1 and the regularity conditions in Appendix A, $\hat\Delta$ converges in probability to $\underline{\Delta}$ and is asymptotically normally distributed with asymptotic variance that we denote by $V$. We focus on determining what happens under stratified randomization or the biased-coin design, where our main result (Section 4) is that consistency and asymptotic normality still hold but the asymptotic variance may be smaller (and a consistent variance estimator is given).

4 Main result

Let be the estimator of corresponding to solving (1) for a given set of estimating functions with parameter , and be the solution to the corresponding expectation of estimating equations (2). Then given Assumption 1 and regularity conditions in Appendix A, under stratified randomization or the biased-coin design,

where is the first-row, first-column entry of the matrix , where

Furthermore, , where is the asymptotic variance under simple randomization. can be consistently estimated using the observed data distribution as described in Appendix B.

Theorem 4 implies that, given our setup and assumptions, whenever an estimator is consistent and asymptotically normally distributed under simple randomization, then it is consistent and asymptotically normally distributed under stratified randomization or the biased-coin design with equal or smaller asymptotic variance.

For the unadjusted estimator, our Theorem 4 is equivalent to Theorem 4.1 of Bugni et al. (2018) under stratified randomization or the biased-coin design. In the special case of continuous outcomes, if the ANCOVA estimator is used with , then Theorem 4 is equivalent to the result of Bugni et al. (2018) in section 4.2 under stratified randomization or the biased-coin design, though their results also handle other types of covariate-adaptive randomization. Our proof relies on a generalization of a key Lemma (Lemma C in Appendix C) from Bugni et al. (2018) that is based on the empirical process result of Shorack and Wellner (2009).

5 Examples of estimators

We give several examples of estimators that our theorem above applies to. For estimators defined in Sections 5.1-5.3, the parameter of interest, i.e. $\Delta^*$, is the average treatment effect defined as $\Delta^* = E[Y(1)] - E[Y(0)]$. In the first two subsections, we assume no missing data.

5.1 The ANCOVA estimator for continuous outcomes; no missing data

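The ANCOVA estimator for $\Delta^*$ involves first fitting a linear regression working model

$$E[Y \mid A, X] = \beta_0 + \beta_A A + \beta_X^t X \qquad (3)$$

and then letting $\hat\Delta^{\text{ancova}}$ be the ordinary least squares estimate of $\beta_A$ in model (3). The working model (3) can be arbitrarily misspecified. The ANCOVA estimator can be equivalently calculated by solving estimating equations (1) letting

$$\psi(A, X, Y, M; \theta) = (Y - \beta_0 - \Delta A - \beta_X^t X)\,(A, 1, X^t)^t,$$

where $\theta = (\Delta, \beta_0, \beta_X^t)^t$.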
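The ANCOVA point estimate can be computed with ordinary least squares. The sketch below uses NumPy; the function name is an assumption, and `x` is assumed to already contain the strata dummy variables along with any additional baseline covariates.

```python
import numpy as np

def ancova_estimate(y, a, x):
    """OLS coefficient of treatment in a regression of y on (1, a, x)."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones_like(y), a, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the treatment indicator
```

The function returns only the point estimate; a variance estimate that is valid under stratified randomization requires the results of Section 4.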

5.2 The standardized logistic regression estimator for binary outcomes; no missing data

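The standardized logistic regression estimator is calculated by first fitting a working model:

$$P(Y = 1 \mid A, X) = \text{expit}(\beta_0 + \beta_A A + \beta_X^t X), \qquad (4)$$

where $\text{expit}(x) = 1/(1 + e^{-x})$, and getting the maximum likelihood estimates (MLE) $(\hat\beta_0, \hat\beta_A, \hat\beta_X)$. Then we have

$$\hat\Delta^{\text{logistic}} = \frac{1}{n}\sum_{i=1}^{n} \left\{\text{expit}(\hat\beta_0 + \hat\beta_A + \hat\beta_X^t X_i) - \text{expit}(\hat\beta_0 + \hat\beta_X^t X_i)\right\}.$$

We do not assume the logistic regression model (4) to be correctly specified. The estimator can be equivalently calculated by solving estimating equations (1) letting

$$\psi(A, X, Y, M; \theta) = \begin{pmatrix} \text{expit}(\beta_0 + \beta_A + \beta_X^t X) - \text{expit}(\beta_0 + \beta_X^t X) - \Delta \\ \{Y - \text{expit}(\beta_0 + \beta_A A + \beta_X^t X)\}\,(1, A, X^t)^t \end{pmatrix}, \qquad \theta = (\Delta, \beta_0, \beta_A, \beta_X^t)^t.$$

Another estimator is the logistic coefficient estimator, defined as $\hat\beta_A$. Unlike the standardized logistic regression estimator, the logistic coefficient estimator estimates a conditional effect and can lead to invalid inference if there is treatment effect heterogeneity. A detailed comparison of the two estimators can be found in Steingrimsson et al. (2016). We hence do not consider the logistic coefficient estimator in this paper.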
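The two steps of the standardized logistic regression estimator (fit the working model, then average the difference in predicted probabilities with treatment set to 1 and to 0 for everyone) can be sketched as follows. The helper names are assumptions, and the logistic MLE is computed with a plain Newton-Raphson loop purely to keep the example self-contained; a statistics package would normally be used instead.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(design, y, n_iter=25):
    """Logistic regression MLE via Newton-Raphson (illustrative only)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        hess = design.T @ (design * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, design.T @ (y - p))
    return beta

def standardized_logistic_estimate(y, a, x):
    """Average difference in predicted risk with everyone treated vs. control."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    ones = np.ones_like(y)
    beta = fit_logistic(np.column_stack([ones, a, x]), y)
    p1 = expit(np.column_stack([ones, ones, x]) @ beta)      # set A = 1
    p0 = expit(np.column_stack([ones, 0 * ones, x]) @ beta)  # set A = 0
    return np.mean(p1 - p0)
```

The returned value is on the risk-difference scale, which is why it targets the marginal average treatment effect rather than the conditional log odds ratio estimated by the logistic coefficient.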

5.3 The DR-WLS estimator for continuous and binary outcomes; outcomes missing at random

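When missing outcome data are present, we estimate $\Delta^*$ by the DR-WLS estimator. The estimator is calculated by the following steps. (1) Fit a logistic regression model:

$$P(M = 1 \mid A, X) = \text{expit}(\alpha_0 + \alpha_A A + \alpha_X^t X), \qquad (5)$$

where expit is the inverse logit function, and get the MLE $(\hat\alpha_0, \hat\alpha_A, \hat\alpha_X)$ of parameters $(\alpha_0, \alpha_A, \alpha_X)$. (2) Fit a generalized linear model

$$E[Y \mid A, X] = g^{-1}(\beta_0 + \beta_A A + \beta_X^t X) \qquad (6)$$

with weights $1/\text{expit}(\hat\alpha_0 + \hat\alpha_A A_i + \hat\alpha_X^t X_i)$ using data with $M_i = 1$. Here the inverse link function $g^{-1}(x)$ is $x$ for continuous outcomes and $\text{expit}(x)$ for binary outcomes. Denote the model prediction for $E[Y \mid A = a, X = X_i]$ as $\hat\mu_a(X_i) = g^{-1}(\hat\beta_0 + \hat\beta_A a + \hat\beta_X^t X_i)$. (3) The DR-WLS estimator is

$$\hat\Delta^{\text{DR-WLS}} = \frac{1}{n}\sum_{i=1}^{n} \left\{\hat\mu_1(X_i) - \hat\mu_0(X_i)\right\}.$$

The DR-WLS estimator can be equivalently calculated by solving estimating equations (1) with

$$\psi(A, X, Y, M; \theta) = \begin{pmatrix} g^{-1}(\beta_0 + \beta_A + \beta_X^t X) - g^{-1}(\beta_0 + \beta_X^t X) - \Delta \\ \dfrac{M}{\text{expit}(\alpha_0 + \alpha_A A + \alpha_X^t X)}\,\{Y - g^{-1}(\beta_0 + \beta_A A + \beta_X^t X)\}\,(1, A, X^t)^t \\ \{M - \text{expit}(\alpha_0 + \alpha_A A + \alpha_X^t X)\}\,(1, A, X^t)^t \end{pmatrix}.$$

The ANCOVA estimator and the standardized logistic regression estimator are special cases of the DR-WLS estimator. If there are no missing data, which means $M_i(a) = 1$ for $a \in \{0, 1\}$ and $i = 1, \dots, n$, then $\hat\Delta^{\text{DR-WLS}}$ reduces to $\hat\Delta^{\text{ancova}}$ for continuous outcomes and to $\hat\Delta^{\text{logistic}}$ for binary outcomes.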
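The three DR-WLS steps can be sketched for a continuous outcome (identity link) as below. The function names are assumptions, the logistic fit is a bare Newton-Raphson loop used only to keep the example self-contained, and a binary outcome would need a weighted logistic fit in step 2 instead of weighted least squares. The sketch also assumes some outcomes are actually missing, since the missingness model cannot be fitted otherwise.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(design, y, n_iter=25):
    """Logistic regression MLE via Newton-Raphson (illustrative only)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        hess = design.T @ (design * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, design.T @ (y - p))
    return beta

def dr_wls_continuous(y, m, a, x):
    """DR-WLS estimate for a continuous outcome; y may be NaN where m == 0."""
    y = np.asarray(y, dtype=float)
    m = np.asarray(m, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(m), -1)
    ones = np.ones_like(m)
    design = np.column_stack([ones, a, x])
    # Step 1: working model for the probability that the outcome is observed.
    alpha = fit_logistic(design, m)
    weights = 1.0 / expit(design @ alpha)
    # Step 2: weighted least squares using participants with observed outcomes.
    obs = m == 1
    sw = np.sqrt(weights[obs])
    beta, *_ = np.linalg.lstsq(design[obs] * sw[:, None], y[obs] * sw, rcond=None)
    # Step 3: average the predicted treatment-control difference over everyone.
    pred1 = np.column_stack([ones, ones, x]) @ beta
    pred0 = np.column_stack([ones, np.zeros_like(ones), x]) @ beta
    return np.mean(pred1 - pred0)
```

Note that step 3 averages over all participants, including those with missing outcomes, because only their baseline covariates are needed for the predictions.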

5.4 Estimators for time-to-event outcomes

We give several examples of estimators for time-to-event outcomes to which our Theorem 4 applies, under our Assumption 1 and regularity conditions in Appendix A. All of these estimators can be represented as M-estimators, which is why our general approach applies to them.

In survival analysis, is the failure time and is the censoring time. One parameter of interest is the restricted mean survival time, defined as , where is a restriction time. Estimators of can be found in Van der Laan et al. (2003); Díaz et al. (2019).

5.5 Results for the ANCOVA estimator, standardized logistic regression estimator and DR-WLS estimator

In Appendix C, we prove several results that apply to the estimators in Sections 5.1-5.3. In Corollary 1, we show that the ANCOVA estimator and standardized logistic regression estimator remain model-robust and the DR-WLS estimator is doubly-robust under stratified randomization or the biased-coin design.

The sandwich variance estimator of (typically used when data are assumed i.i.d.) is defined as

from Section 3.2 of Tsiatis (2007) or Theorem 5.21 of van der Vaart (1998); the sandwich variance estimator for is then the first-row first-column entry of .
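For the ANCOVA estimating function, the sandwich form can be computed directly. The sketch below uses the standard i.i.d. "bread-meat-bread" construction (equivalent to the HC0 heteroskedasticity-robust variance); the function and variable names are assumptions, and the treatment indicator is placed first in the design matrix so that the first-row, first-column entry corresponds to the treatment effect. As discussed in this section, this i.i.d. estimator is not guaranteed to match the asymptotic variance under stratified randomization unless the conditions of Corollary 1 hold.

```python
import numpy as np

def ancova_sandwich(y, a, x):
    """ANCOVA estimate and its i.i.d. sandwich standard error."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    n = len(y)
    design = np.column_stack([a, np.ones_like(y), x])  # treatment first
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ theta
    # Bread: average derivative of the estimating function w.r.t. theta.
    bread = -(design.T @ design) / n
    # Meat: average outer product of the estimating function with itself.
    meat = (design * resid[:, None] ** 2).T @ design / n
    bread_inv = np.linalg.inv(bread)
    v_hat = bread_inv @ meat @ bread_inv.T
    se_delta = np.sqrt(v_hat[0, 0] / n)  # standard error of the effect estimate
    return theta[0], se_delta
```

A Wald-type 95% confidence interval would then be the estimate plus or minus 1.96 times `se_delta`.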

The following corollary gives conditions under which the asymptotic distribution of the estimators in the subsections above is the same regardless of whether simple randomization, stratified randomization, or the biased-coin design is used. Under such conditions, the estimators and their corresponding sandwich variance estimators can be used to perform valid hypothesis tests and construct confidence intervals (without being conservative).

Corollary 1.

For the ANCOVA estimator, standardized logistic regression estimator and DR-WLS estimator, we assume their estimating equations satisfy Assumption 1 and regularity conditions in Appendix A. For the DR-WLS estimator, we further assume at least one of the two working models (5) and (6) is correctly specified. If (1) $\pi = 0.5$, or (2) the outcome regression model includes a treatment-by-strata interaction term, or (3) the outcome regression model is correctly specified, then under stratified randomization or the biased-coin design, these estimators are consistent and asymptotically normally distributed with asymptotic variance $V$. Furthermore, the sandwich variance estimator is consistent (for the true asymptotic variance).

6 Clinical trial applications

Table 1 summarizes our data analyses for each application (NIDA-CTN-0003, NIDA-CTN-0030, NIDA-CTN-0044). All missing baseline values were imputed by the median for continuous variables and mode for binary or categorical variables. When implementing the ANCOVA estimator or standardized logistic regression estimator, all participants with missing outcomes were removed from the analysis. Estimates and standard errors are rounded to the nearest 0.01. “Confidence Interval” is abbreviated as “CI”. For all three trials, negative (positive) estimates are in the direction of clinical benefit (harm).

| Study | Estimator adjusting for strata only (95% CI) | Estimator adjusting for all baseline variables (95% CI) | DR-WLS estimator (95% CI) | Variance reduction |
|---|---|---|---|---|
| NIDA-CTN-0003 | -0.11 (-0.21, -0.01) | -0.10 (-0.19, -0.02) | -0.10 (-0.18, -0.02) | 35% |
| NIDA-CTN-0030 | 0.02 (-0.02, 0.05) | 0.01 (-0.02, 0.05) | 0.01 (-0.02, 0.05) | 17% |
| NIDA-CTN-0044 | -0.09 (-0.14, -0.03) | -0.09 (-0.14, -0.03) | -0.09 (-0.15, -0.03) | 2% |

Table 1: Summary of clinical trial data analyses. The first column is the study ID. The second column gives the estimator with 95% confidence intervals (CI) adjusting for strata only. The third column gives the estimator with 95% confidence intervals (CI) adjusting for strata and additional baseline variables. The fourth column gives the DR-WLS estimator with 95% confidence intervals (CI) adjusting for strata and additional baseline variables. The fifth column shows the variance reduction due to adjustment for additional baseline variables, comparing the second and third columns.

For NIDA-CTN-0003, the standardized logistic regression estimator was used since the outcome was binary. When adjusting for strata only, the estimated absolute risk difference of getting a negative urine lab result was -0.11 with 95% CI (-0.21, -0.01); when adjusting for strata as well as additional baseline variables, the point estimate became -0.10 with 95% CI (-0.19, -0.02). Though the point estimates are similar, the variance reduction due to adjusting for additional baseline variables is 35%, indicating that researchers planning to perform adjustment for strata and additional baseline variables could achieve the same precision as adjusting for strata only with approximately 35% fewer participants. The DR-WLS estimate was -0.10 with 95% CI (-0.18, -0.02), which had 1% wider 95% CI than the standardized logistic regression estimator with adjustment for strata and additional baseline variables. This precision loss is typical; it is the price paid for added robustness to model misspecification, since the DR-WLS estimator is consistent under missing at random when at least one of its working models is correct, while the other estimators are generally only consistent when outcomes are missing completely at random or when the outcome regression working model is correct.
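The link between a 35% variance reduction and 35% fewer participants can be checked with a short calculation. It assumes, as a simplification made here rather than a result stated in the text, that each estimator's variance is of the form $\sigma^2 / n$; the symbols $r$, $\sigma^2_{\text{strata}}$, $\sigma^2_{\text{adj}}$, $n_{\text{strata}}$ and $n_{\text{adj}}$ are introduced for illustration.

```latex
% r: proportional variance reduction from adjusting for additional baseline variables
\[
\sigma^2_{\text{adj}} = (1 - r)\,\sigma^2_{\text{strata}},
\qquad
\frac{\sigma^2_{\text{adj}}}{n_{\text{adj}}} = \frac{\sigma^2_{\text{strata}}}{n_{\text{strata}}}
\;\Longrightarrow\;
n_{\text{adj}} = (1 - r)\, n_{\text{strata}} .
\]
% With r = 0.35, the adjusted analysis needs 65% of the sample size.
\[
r = 0.35: \qquad n_{\text{adj}} = 0.65\, n_{\text{strata}},
\qquad
\frac{\text{CI width}_{\text{adj}}}{\text{CI width}_{\text{strata}}} = \sqrt{1 - r} = \sqrt{0.65} \approx 0.81 .
\]
```

At a fixed sample size, the same variance reduction corresponds to a confidence interval that is roughly 19% narrower, because interval width scales with the standard error rather than the variance.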

NIDA-CTN-0030 and NIDA-CTN-0044 had continuous outcomes. Adjustment for additional baseline variables brings 17% and 2% variance reduction for NIDA-CTN-0030 and NIDA-CTN-0044, respectively, compared to adjusting only for the strata. When additional baseline variables are not strongly prognostic, such as in NIDA-CTN-0044, the variance reduction from additional baseline variables can be small.

7 Discussion

There is potential to substantially improve precision by adjusting for baseline variables in addition to those used in the randomization procedure. For example, for NIDA-CTN-0003 and NIDA-CTN-0030, adjustment for additional baseline variables brings 35% and 17% variance reduction, respectively. Our results show that such adjustment is valid under stratified randomization and the biased-coin design, and that many estimators used in randomized trials are consistent and asymptotically normal, with variance that can be consistently estimated using our formula in Theorem 4. This asymptotic variance may be less than in the i.i.d. case, and our variance formula captures any added precision gain from stratified randomization (asymptotically).

The key to improving precision is adjusting for strongly prognostic baseline variables, if they exist. At the outset, one could use previous trials or observational data from a similar population to measure the prognostic value added by a set of baseline variables. This can be done by fitting two models, with one adjusting for the set of baseline variables and the other one not, and comparing their sandwich variance estimates.
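The planning-stage comparison described above can be sketched for a continuous outcome as follows. This is one possible implementation under stated assumptions: both models are linear working models fitted by least squares, the variance is the i.i.d. sandwich (HC0) variance of the treatment coefficient, and the function names and arguments are hypothetical.

```python
import numpy as np

def robust_variance_of_effect(y, a, covariates):
    """Sandwich (HC0) variance of the treatment coefficient in OLS."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    n = len(y)
    covariates = np.asarray(covariates, dtype=float).reshape(n, -1)
    design = np.column_stack([a, np.ones(n), covariates])  # treatment first
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ theta
    bread_inv = np.linalg.inv(design.T @ design)
    meat = (design * resid[:, None] ** 2).T @ design
    return (bread_inv @ meat @ bread_inv)[0, 0]

def variance_reduction(y, a, base_covariates, extra_covariates):
    """Proportional variance reduction from adding extra_covariates."""
    n = len(y)
    base = np.asarray(base_covariates, dtype=float).reshape(n, -1)
    extra = np.asarray(extra_covariates, dtype=float).reshape(n, -1)
    v_base = robust_variance_of_effect(y, a, base)
    v_full = robust_variance_of_effect(y, a, np.column_stack([base, extra]))
    return 1.0 - v_full / v_base
```

Here `base_covariates` would hold the strata dummy variables and `extra_covariates` the candidate additional baseline variables; a value near zero suggests the extra variables add little prognostic information.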

Our asymptotics, as essentially all asymptotic results under the commonly used superpopulation inference framework, assume that the number of strata is fixed and the number of participants in each stratum goes to infinity. This may be a reasonable approximation when no stratum has a small number of participants. In our data examples, the smallest stratum has 49 participants. An area of future research is to consider cases where some strata have few participants.


This project was supported by a research award from Arnold Ventures. The content is solely the responsibility of the authors and does not necessarily represent the official views of Arnold Ventures. The information reported here results from secondary analyses of data from clinical trials conducted by the National Institute on Drug Abuse (NIDA). Specifically, data from NIDA–CTN-0003 (Suboxone (Buprenorphine/Naloxone) Taper: A Comparison of Two Schedules), NIDA-CTN-0030 (Prescription Opioid Addiction Treatment Study) and NIDA-CTN-0044 (Web-delivery of Evidence-Based, Psychosocial Treatment for Substance Use Disorders) were included. NIDA databases and information are available at (


  • Bugni et al. (2018) Bugni, F. A., Canay, I. A., and Shaikh, A. M. (2018). Inference under covariate-adaptive randomization. Journal of the American Statistical Association 113, 1784–1796.
  • Campbell et al. (2014) Campbell, A. N., Nunes, E. V., Matthews, A. G., Stitzer, M., Miele, G. M., Polsky, D., Turrigiano, E., Walters, S., McClure, E. A., Kyle, T. L., Wahle, A., Van Veldhuisen, P., Goldman, B., Babcock, D., Stabile, P. Q., Winhusen, T., and Ghitza, U. E. (2014). Internet-delivered treatment for substance abuse: A multisite randomized controlled trial. American Journal of Psychiatry 171, 683–690.
  • Díaz et al. (2019) Díaz, I., Colantuoni, E., Hanley, D. F., and Rosenblum, M. (2019). Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards. Lifetime Data Analysis 25, 439–468.
  • Efron (1971) Efron, B. (1971). Forcing a sequential experiment to be balanced. Biometrika 58, 403–417.
  • EMA (2015) EMA (2015). European Medicines Agency Guideline on Adjustment for Baseline Covariates in Clinical Trials. Reference number EMA/CHMP/295050/2013. Committee for Medicinal Products for Human Use (CHMP).
  • FDA (2019) FDA (2019). Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biologics with Continuous Outcomes Guidance for Industry. Docket Number:2019-08470. Center for Drug Evaluation and Research.
  • FDA and EMA (1998) FDA and EMA (1998). E9 statistical principles for clinical trials. U.S. Food and Drug Administration: CDER/CBER. European Medicines Agency: CPMP/ICH/363/96 .
  • Kahan and Morris (2012) Kahan, B. C. and Morris, T. P. (2012). Improper analysis of trials randomised using stratified blocks or minimisation. Statistics in Medicine 31, 328–340.
  • Lachin et al. (1988) Lachin, J., Matts, J., and Wei, L. (1988). Randomization in clinical trials: Conclusions and recommendations. Controlled Clinical Trials 9, 365–374.
  • Li and Ding (2019) Li, X. and Ding, P. (2019). Rerandomization and regression adjustment. arXiv page
  • Lin et al. (2015) Lin, Y., Zhu, M., and Su, Z. (2015). The pursuit of balance: An overview of covariate-adaptive randomization techniques in clinical trials. Contemporary Clinical Trials 45, 21–25. 10th Anniversary Special Issue.
  • Ling et al. (2009) Ling, W., Hillhouse, M., Domier, C., Doraimani, G., Hunter, J., Thomas, C., Jenkins, J., Hasson, A., Annon, J., Saxon, A., Selzer, J., Boverman, J., and Bilangi, R. (2009). Buprenorphine tapering schedule and illicit opioid use. Addiction 104, 256–265.
  • Ma et al. (2015) Ma, W., Hu, F., and Zhang, L. (2015). Testing hypotheses of covariate-adaptive randomized clinical trials. Journal of the American Statistical Association 110, 669–680.
  • Ma et al. (2018) Ma, W., Qin, Y., Li, Y., and Hu, F. (2018). Statistical inference of covariate-adjusted randomized experiments. arXiv page
  • Moore and van der Laan (2009) Moore, K. and van der Laan, M. (2009). Covariate adjustment in randomized trials with binary outcomes: Targeted maximum likelihood estimation. Statistics in Medicine 28, 39–64.
  • Morgan and Rubin (2012) Morgan, K. L. and Rubin, D. B. (2012). Rerandomization to improve covariate balance in experiments. The Annals of Statistics 40, 1263–1282.
  • Pocock and Simon (1975) Pocock, S. J. and Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31, 103–115.
  • Robins et al. (2007) Robins, J., Sued, M., Lei-Gomez, Q., and Rotnitzky, A. (2007). Comment: Performance of double-robust estimators when “inverse probability” weights are highly variable. Statistical Science 22, 544–559.
  • Scharfstein et al. (1999) Scharfstein, D. O., Rotnitzky, A., and Robins, J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94, 1096–1120.
  • Shao and Yu (2013) Shao, J. and Yu, X. (2013). Validity of tests under covariate-adaptive biased coin randomization and generalized linear models. Biometrics 69, 960–969.
  • Shao et al. (2010) Shao, J., Yu, X., and Zhong, B. (2010). A theory for testing hypotheses under covariate-adaptive randomization. Biometrika 97, 347–360.
  • Shorack and Wellner (2009) Shorack, G. R. and Wellner, J. A. (2009). Empirical processes with applications to statistics. Society for Industrial and Applied Mathematics.
  • Steingrimsson et al. (2016) Steingrimsson, J., Hanley, D., and Rosenblum, M. (2016). Improving precision by adjusting for baseline variables in randomized trials with binary outcomes, without regression model assumptions. Contemporary Clinical Trials 54, 18–24.
  • Tsiatis (2007) Tsiatis, A. (2007). Semiparametric theory and missing data. Springer Science & Business Media.
  • Van der Laan and Robins (2003) Van der Laan, M. J. and Robins, J. M. (2003). Unified methods for censored longitudinal data and causality. Springer Science & Business Media.
  • van der Vaart (1998) van der Vaart, A. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.
  • Wei (1978) Wei, L. J. (1978). The adaptive biased coin design for sequential experiments. The Annals of Statistics 6, 92–100.
  • Weiss et al. (2011) Weiss, R. D., Potter, J. S., Fiellin, D. A., Byrne, M., Connery, H. S., Dickinson, W., Gardin, J., Griffin, M. L., Gourevitch, M. N., Haller, D. L., Hasson, A. L., Huang, Z., Jacobs, P., Kosinski, A. S., Lindblad, R., McCance-Katz, E. F., Provost, S. E., Selzer, J., Somoza, E. C., Sonne, S. C., and Ling, W. (2011). Adjunctive Counseling During Brief and Extended Buprenorphine-Naloxone Treatment for Prescription Opioid Dependence: A 2-Phase Randomized Controlled Trial. JAMA Psychiatry 68, 1238–1246.
  • Yang and Tsiatis (2001) Yang, L. and Tsiatis, A. (2001). Efficiency study of estimators for a treatment effect in a pretest-posttest trial. The American Statistician 55, 314–321.
  • Ye and Shao (2019) Ye, T. and Shao, J. (2019). Validity and robustness of tests in survival analysis under covariate-adaptive randomization. arXiv page
  • Zelen (1974) Zelen, M. (1974). The randomization and stratification of patients to clinical trials. Journal of Chronic Diseases 27, 365–375.

Appendix A Regularity conditions

The regularity conditions, which are similar to those in Section 5.3 of van der Vaart (1998), are specified below:

(1) , a compact set in .

(2) for any and .

(3) There exists a zero, denoted as , of the expectation of estimating equations

Furthermore, there is a such that is the only zero in its neighborhood .

(4) functions are twice continuously differentiable for every in the support of and dominated by an integrable function .

(5) There exist a and and integrable function , such that
element-wise for every in the support of and .


(6) has finite second moment for


(7) is invertible.

Appendix B Consistent estimators of and in Theorem 4

The consistency of and for and is proved in Appendix C.

Appendix C Proof of main results

In this section, we first present two lemmas that are critical for proving our main results. Both are adapted from results of Bugni et al. (2018), and their proofs closely follow the corresponding proofs there. We then prove Theorem 4 using these two lemmas. To prove Corollary 1, we first introduce and prove Corollary 2, which gives the asymptotic distribution of the DR-WLS estimator and partially implies Corollary 1.

Given Assumption 1, let such that . Then under stratified randomization or the biased-coin design, .


See Lemma B.3 in the supplementary material of Bugni et al. (2018). The only difference is that we replace by , and every step of the argument still holds. ∎

Given Assumption 1, let and such that for . Then under stratified randomization or the biased-coin design,



See Lemma B.1 and Lemma B.2 in the supplementary material of Bugni et al. (2018). The only difference is that we replace by , and every step of the argument still holds. ∎
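Both lemmas are stated for stratified randomization and the biased-coin design. As an informal illustration of what distinguishes these designs from simple randomization, the following minimal sketch simulates treatment assignment under each and reports the largest within-stratum imbalance (number treated minus number in control). All function names, the block size of 4, the coin probability 2/3, and the four equally likely strata are assumptions made for this illustration; they are not the settings of the trials analyzed in the paper.

```python
import random
from collections import defaultdict


def stratified_block_assign(strata, block_size=4, seed=0):
    """Permuted-block randomization within each stratum (1 = treatment, 0 = control).

    Assumes block_size is even, so each block is split equally between arms.
    """
    rng = random.Random(seed)
    pending = defaultdict(list)  # remaining slots of the current block, per stratum
    assignments = []
    for s in strata:
        if not pending[s]:
            block = [1] * (block_size // 2) + [0] * (block_size // 2)
            rng.shuffle(block)
            pending[s] = block
        assignments.append(pending[s].pop())
    return assignments


def biased_coin_assign(strata, p=2 / 3, seed=0):
    """Efron-style biased coin applied separately within each stratum.

    The arm that is currently under-represented in the stratum is chosen
    with probability p; a fair coin is used when the stratum is balanced.
    """
    rng = random.Random(seed)
    imbalance = defaultdict(int)  # (#treated - #control) so far, per stratum
    assignments = []
    for s in strata:
        d = imbalance[s]
        if d == 0:
            prob_treat = 0.5
        elif d > 0:
            prob_treat = 1 - p
        else:
            prob_treat = p
        a = 1 if rng.random() < prob_treat else 0
        imbalance[s] += 1 if a == 1 else -1
        assignments.append(a)
    return assignments


def simple_assign(strata, seed=0):
    """Simple randomization: independent fair coin flips that ignore the strata."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in strata]


def max_stratum_imbalance(strata, assignments):
    """Largest absolute (#treated - #control) over all strata."""
    diff = defaultdict(int)
    for s, a in zip(strata, assignments):
        diff[s] += 1 if a == 1 else -1
    return max(abs(d) for d in diff.values())


if __name__ == "__main__":
    rng = random.Random(123)
    strata = [rng.randrange(4) for _ in range(2000)]  # hypothetical strata labels
    designs = [
        ("simple", simple_assign),
        ("stratified blocks", stratified_block_assign),
        ("biased coin", biased_coin_assign),
    ]
    for name, assign in designs:
        print(name, max_stratum_imbalance(strata, assign(strata)))
```

With permuted blocks of size 4, the within-stratum imbalance can never exceed 2 regardless of sample size, whereas under simple randomization it typically grows on the order of the square root of the stratum size. This tighter control of imbalance is the feature of the designs that the lemmas rely on.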

Proof of Theorem 4.

Using the fact that and , the estimating equations (1) can be re-written as

We first show that , where is a vector that solves

Regularity condition (3) in Appendix A implies that exists and is the only zero in its neighborhood. By Lemma C and regularity condition (2), we have

Combined with regularity conditions (1) and (4), and by Theorem 5.9 of van der Vaart (1998), the above results imply that converges in probability to .

We next show that is asymptotically linear. By a multivariate Taylor expansion,


for , where is a random point on the line segment between and .
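For orientation, a second-order expansion of this kind can be written generically as below. The notation is assumed for illustration and may differ from the paper's: $\psi_j$ denotes the $j$-th component of the estimating function, $\mathbb{P}_n$ the empirical average over participants, $\hat\theta$ the solution of the estimating equations, $\theta_0$ its probability limit, $\tilde\theta_j$ an intermediate point on the segment between them, and $p$ the dimension of the parameter.

```latex
0 = \mathbb{P}_n \psi_j(\hat\theta)
  = \mathbb{P}_n \psi_j(\theta_0)
  + \mathbb{P}_n \dot\psi_j(\theta_0)^{\top} (\hat\theta - \theta_0)
  + \tfrac{1}{2}\, (\hat\theta - \theta_0)^{\top}\,
    \mathbb{P}_n \ddot\psi_j(\tilde\theta_j)\, (\hat\theta - \theta_0),
\qquad j = 1, \dots, p,
```

Here $\dot\psi_j$ and $\ddot\psi_j$ are the gradient and Hessian of $\psi_j$ with respect to the parameter. In an expansion of this form, the quadratic term is the one that has to be controlled through a domination condition on the second derivatives.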

According to regularity condition (5), there exists a ball around such that is dominated by a function . Hence, if , then

which is bounded in probability by the law of large numbers. Furthermore, since

, we have

Denoting , by Lemma C and regularity condition (6), we have