Estimating the effect of PEG in ALS patients using observational data subject to censoring by death and missing outcomes

by   Pallavi Mishra-Kalyani, et al.

Though they may offer valuable patient and disease information that is impossible to study in a randomized trial, clinical disease registries also require special care and attention in causal inference. Registry data may be incomplete, inconsistent, and subject to confounding. In this paper we aim to address several analytical issues in estimating treatment effects that plague clinical registries such as the Emory amyotrophic lateral sclerosis (ALS) Clinic Registry. When attempting to assess the effect of a surgical insertion of a percutaneous endoscopic gastrostomy (PEG) tube on body mass index (BMI) using the data from the ALS Clinic Registry, one must combat issues of confounding, censoring by death, and missing outcome data that have not been addressed in previous studies of PEG. We propose a causal inference framework for estimating the survivor average causal effect (SACE) of PEG, which incorporates a model for generalized propensity scores to correct for confounding by pre-treatment variables, a model for principal stratification to account for censoring by death, and a model for the missing data mechanism. Applying the proposed framework to the ALS Clinic Registry Data, our analysis shows that PEG has a positive SACE on BMI at month 18 post-baseline; our results likely offer more definitive answers regarding the effect of PEG than previous studies of PEG.



There are no comments yet.


page 1

page 2

page 3

page 4


Implementation of Tripartite Estimands Using Adherence Causal Estimators Under the Causal Inference Framework

Intercurrent events (ICEs) and missing values are inevitable in clinical...

Combining Experimental and Observational Data for Identification and Estimation of Long-Term Causal Effects

We consider the task of identifying and estimating the causal effect of ...

Probabilistic Analysis of Balancing Scores for Causal Inference

Propensity scores are often used for stratification of treatment and con...

An introduction to causal reasoning in health analytics

A data science task can be deemed as making sense of the data and/or tes...

Why Interpretable Causal Inference is Important for High-Stakes Decision Making for Critically Ill Patients and How To Do It

Many fundamental problems affecting the care of critically ill patients ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

While observational data from disease registries can be used to assess treatment effects, they are often fraught with analytical challenges. This is particularly true of a disease with high disability and mortality rates as issues of unmeasured data often occur due to a patient’s absence or death. Our work is motivated by the analysis of data from an amyotrophic lateral sclerosis (ALS) Clinic Registry, aimed at estimating the effect of a palliative treatment, percutaneous endoscopic gastrostomy (PEG) on body mass index (BMI), a measure of nutritional management and quality of life. ALS is a neurodegenerative disorder with a very poor prognosis (Gelinas and Miller, 2000; Procaccini and Nemergut, 2008). Currently there is no known cure for ALS and existing treatment procedures for ALS patients are mostly palliative; there has been substantial interest in assessing effects of these palliative procedures on quality of life (Miller et al., 2009).

Dysphagia, or difficulty in swallowing, affects almost all patients with ALS, and subsequently along with muscle atrophy and hypermetabolism causes malnutrition amongst the ALS patient population. Nutritional management is key in disease management and palliative care (Desport et al., 2006; Muscaritoli et al., 2012). PEG, a surgically inserted tube providing enteral nutrition, is generally considered when an ALS patient’s nutritional status deteriorates and weight loss is greater than 10% of the baseline weight (Goyal and Mozaffar, 2014; Park et al., 1992), though patients may elect not to receive PEG for other reasons. While some studies have shown that PEG tube insertion offers some benefits in stabilizing or even increasing body weight, the overall results regarding the effect of PEG on weight or BMI are inconclusive, and in particular the long term benefits of the procedure are uncertain (Kasarskis et al., 1996; Katzberg and Benatar, 2011; Mazzini et al., 1995). One challenge for assessing the effect of PEG in ALS patients is that randomized studies are not ethically feasible, since PEG placement has long been recommended as standard of care in ALS clinics for patients with nutritional compromise (Miller et al., 2009). As a result, the effect of PEG has been evaluated using observational data, and previous studies did not adequately address analytical issues including confounding and missing data that were commonly encountered. In addition, the existing studies had small to moderate sample sizes, limiting the generalizability of their findings.

The Emory ALS Clinic Registry has several notable strengths, making it well-suited for assessing the effect of PEG. It includes more than 1000 patients with long-term follow-up and a wide range of relevant clinical variables such as BMI, vital capacity (VC), forced vital capacity (FVC), negative inspiratory force (NIF), and ALS Functional Rating Scale Revised score (ALSFRS-R). In particular, ALSFRS-R is a measure of disease progression and is not always collected in large ALS registries. However, the data present several analytical challenges to estimating a causal effect of PEG. First, insertion of PEG is not randomized, which may lead to confounding by pre-treatment variables. Second, due to the fatal and fast-progressing nature of ALS outcome measurements may be “censored by death,” i.e, a patient dies before the outcome of interest can be measured (Zhang and Rubin, 2003; Rubin, 2006). Third, for those who are not “censored by death,” the outcome may be missing due to various reasons.

In the context of Rubin’s Causal Model, principal stratification, first described in detail in Frangakis and Rubin (2002), offers a framework for causal inference in the presence of a confounding post-treatment variable, such as censoring by death. Subsequently, Zhang and Rubin (2003) extended this approach to cases where a post-treatment variable such as survival or graduation “censors” the outcome of interest. Zhang et al. (2009) further outlined specific modeling approaches for the identification of survivor average causal effect (SACE) within the principal stratification framework. These existing works in this area, however, analyzed data from randomized studies, assuring complete collection of outcome data and, more importantly, avoiding the issues of confounding, an advantage not guaranteed in our retrospective clinic registry data.

More recently, Frumento et al. (2012) addressed post-treatment variables and missing outcome data simultaneously within a principal stratification framework in a randomized study. However, the Emory ALS Clinic Registry data introduce some unique challenges that were not present in the analysis of Frumento et al. (2012) and require new analytical approaches. In particular, our study treatment is not randomized and includes a component of time to treatment, leading to potential confounding by pre-treatment variables. Furthermore, both treatment and outcome measurement are subject to censoring by death in our study. Finally, missing outcome data in our study are due to the observational nature of the study, and have different implications compared to the missing outcomes in Frumento et al. (2012). Therefore, while some of the same tools are utilized in our work, the proposed framework addresses important new issues.

It is particularly important to address potential confounding by pre-treatment variables in a principal stratification model, because Frangakis and Rubin (2002) defined principal stratification in a randomized experiment. When such confounding may be present, either as residual confounding in a randomized clinical trial or due to observational data, Schwartz et al. (2012) showed that the resulting principal effect estimate is likely to be biased. Of the many methods for handling confounding, a popular choice is the propensity score introduced by Rosenbaum and Rubin (1983). The propensity score provides a means of balancing pre-treatment variables across treatment groups, a result that would otherwise be guaranteed if a randomized study design is used. Though propensity scores were initially introduced for balancing treatment assignment groups when treatment is binary, other authors have extended these methods to non-binary treatment assignment models, such as generalized propensity score methods (Imai and Van Dyk, 2004; Hirano and Imbens, 2004). These methods allow for balancing pre-treatment variables when treatment assignment is categorical or continuous. A recent work by Jo and Stuart (2009) used propensity scores within a principal stratification framework, not for the removal of confounding as the study was randomized, but instead to predict principal strata membership in a matched analysis, again different from our setting. Furthermore, propensity score methods for non-binary treatment assignment have not yet been employed for conditional ignorability in a principal stratification framework when outcomes may be missing.

To account for aforementioned complicating analytical issues, we develop a causal inference approach for estimating the survivor average causal effect (SACE) of PEG. Building on the principal stratification framework to account for censoring by death (i.e., confounding by a post-treatment variable), our approach uses generalized propensity scores to correct for confounding by pre-treatment variables and it also accounts for missing outcome data by incorporating a model for the missing data mechanism. Although it is developed for the specific application, our approach can be applied to analysis of other observational studies that are subject to similar issues. The remainder of the article is organized as follows. First, we introduce the data and notation in Section 2. We present our framework for causal inference in Section 3

and a Bayesian inference procedure in Section 

4. In Section 5, we apply the proposed approach to the Emory ALS data. We conclude this paper with some discussion remarks in Section 6.

2 Data Structure in the ALS Registry

In the analysis of the ALS Clinic Registry, our goal is to assess the effect of PEG on BMI (denoted by ) at months post-baseline; without loss of generality we assume that . We denote by the set of patient baseline characteristics and by the survival time post-baseline confirmed by Social Security database records, noting that all patients in our data set died. The surgical insertion of a PEG tube may be administered at any time post-baseline given patient choice. Let be the binary indicator such that if a patient received PEG prior to the time of outcome measurement and if otherwise; let denote time to treatment. It follows that patients untreated prior to have values and if they did not survive until , or and if they survived past . For those who died prior to , is undefined and is known as “censored by death.” For those who are alive at , is defined but may be missing due to other reasons and let be the binary indicator such that if is missing and if is observed. Of note, for patients with who were “censored by death,” both and are undefined, denoted by “*”, following the notation of Zhang and Rubin (2003). This extends the sample space for to {} and for to {}.

3 Causal Inference Framework

As earlier defined, the binary indicator for treatment from baseline until (time of outcome measurement) is and the outcome of interest is . A post-treatment indicator of survival past the is . Following the Rubin Causal Model (Holland, 1986), we define potential outcomes and under the two treatment options and 1. For simplicity, we suppress in what follows if it does not create ambiguity.

Figure 1 illustrates the potential outcomes in the Emory ALS data. For example, patients who are treated prior to time of outcome measurement () may follow one of two paths: survival until ( = 1) or death prior to ( = 0). Those individuals who survive may have observed , as designated by the darkest shaded box in Figure 1, or have missing as designated in the medium shaded box. Patients who die before have both and undefined, i.e., censored by death, as designated by the lightest shaded box.

Figure 1: Potential outcomes for all patients given survival and treatment status.

In the presence of censoring by death, we use post treatment survival status to define four potential principal strata () by , namely, , , , and . Individuals who are in the stratum are those who would be alive at time regardless of treatment. Those individuals who are in the stratum would be alive if they received the treatment but would not be alive if they did not and those who are in the stratum would not be alive if they receive the treatment, but would be alive if they did not. Finally, individuals in the are those who would die prior to time regardless of treatment. A causal effect of treatment is only estimable in strata where individuals would be alive and therefore have a measurable outcome, regardless of treatment status. As such, the causal effect is estimable in the stratum only. Specifically, the Survivor Average Causal Effect (SACE) is defined as the difference in between treated and untreated individuals in the stratum.

However, we do not get to observe since we only know the survival status given the observed treatment status. Based on , there are four observed groups denoted by : , , , and . Each of these observed groups is a mixture of the principal strata. Specifically, (1,1) is a mixture of individuals from the and strata, (1,0) is comprised of individuals from the and strata, (0,1) is comprised of individuals from the and strata, and (0,0) is comprised of individuals from the and strata. In addition, based on , we have six observed groups denoted by (), namely, (1,1,1), (1,1,0), (1,0,-), (0,1,1), (0,1,0), (0,0,-); see Table LABEL:t3tab:like. The two original observed groups with , namely, (1,1) and (0,1), are each divided into two subgroups based on whether is observed or missing. The groups with do not need to be divided further, since is undefined.

3.1 Assumptions

In our causal inference framework, we adopt two standard assumptions, the Stable Unit Treatment Value Assumption (SUTVA) as defined by Cox in 1958 and further elaborated in Rubin (1980) and the Strong Ignorability of Treatment Assignment (Rosenbaum and Rubin, 1983).

Assumption 1

[SUTVA] The potential outcomes of any unit is independent of the treatment status of the other units given .

Assumption 2

[Strong Ignorability of Treatment Assignment] and (or equivalently ) are independent of given .

Regarding the missing data mechanism for , we consider two assumptions, a latent ignorability (Frangakis and Rubin, 1999) and a standard ignorability (Little and Rubin, 2002).

Assumption 3

[Latent Ignorable Missingness] does not depend on the missing values or parameters of the data distribution conditional on and ; particularly, is independent of given and .

Assumption 4

[Ignorable Missingness] does not depend on the missing values or parameters of the data distribution conditional on ; particularly, is independent of given .

Since is not completely known, the latent ignorable missingness in Assumption 3 is not ignorable and it is necessary to model in the modeling framework.

3.2 Data Likelihood

We define the complete data as for , which under Assumption 1 constitute independent and identically distributed observations. Accordingly, the observed data are represented as , where is the set of feasible principal strata that an individual may belong to given and .

We first relate the distribution of the complete data to the distribution of the potential outcomes under Assumptions 1-3. We have

where the first equality is due to the fact that is redundant given and , the second equality is due to Assumption 3, and the third equality is due to Assumption 2. In addition, , since is fully determined by and . From Assumption 2, we have . Using these facts and Assumption 1, the complete data likelihood can be written as


where models the potential outcomes, models the missing data mechanism, models the principal strata, and models the treatment assignment. Without loss of clarity, we define if is missing; in other words, only observed contributes to the first part of the complete data likelihood (1). Assuming the parameters in are distinct from the other parameters and for , we can drop from the complete data likelihood.

The observed data likelihood is considerably more complicated, since it involves mixture distributions since we only observe and do not observe . As shown in Section 4.4, a data augmentation algorithm allows us to work with the complete data likelihood directly and hence avoid the complicated observed data likelihood.

Since the treatment of interest includes a time-to-intervention component , we use generalized propensity scores in the spirit of Imai and Van Dyk (2004), Lu (2005) and Hirano and Imbens (2004) to further accommodate this unique feature. Specifically, we model which may be censored by using a Cox proportional hazards (PH) model and use its linear predictor as the generalized propensity score, denoted by . Along the lines of Imai and Van Dyk (2004) and Lu (2005), it follows from Assumptions 1 and 2 that is independent of and of conditional on . Then the complete data likelihood (1) can be rewritten as


where is dropped.

Alternatively, under Assumptions 1, 2 and 4, i.e., replacing the latent ignorable missingness with the ignorable missingness, the complete data likelihood (2) can be simplified to

Assuming that its parameters are distinct from parameters in other models, can be dropped from the complete data likelihood, leading to


In Section 5, we analyze the ALS data using both (2) and (3) under the latent ignorable missingness and the ignorable missingness, respectively.

4 Modeling Framework

In this section, we specify the model for each component of the complete data likelihood (2) and then describe a Bayesian inference approach under Assumptions 1-3. The corresponding inference approach under Assumptions 1-2 and 4 is simpler and can be derived along similar lines.

4.1 Model for Potential Outcomes

For in (2), we assume that within each principal stratum

have a normal distribution,

, with parameters and covariates that may differ by strata. Also, individual likelihoods for are dependent on potential strata , and are represented by . Specifically, for , where includes the column for intercept, , , and . and include intercept and but not and as patients with an observed outcome in each of these strata are either all treated or all untreated respectively. The regression coefficient for in the stratum represents SACE and is of primary interest. Inclusion of allows us to assess the relationship between and time to treatment within those who receive the treatment.

4.2 Model for Principal Stratification

The probabilities of the four principal strata, denoted by

, are modeled using a multinomial logit model with a set of covariates denoted by

including . he probability of a patient being in principal stratum is given in equation (4), and individual probabilities of principal strata are represented by . As in any multinomial logit model, one category must be selected as a reference group.


Of the assumptions available for sharpening the bounds of effect estimates in the principal stratification framework, a monotonicity assumption is often used, i.e., one of the principal strata, say the group does not exist Zhang and Rubin (2003). While this monotonicity assumption may be plausible if treatment is always beneficial on patient survival, it is also possible that a subpopulation of weak patients would be not be able to withstand treatment or recovery and may die shortly after treatment, but could stay alive if left untreated, when evaluating an intrusive treatment. For such a data analysis, it may be appropriate to allow all four principal strata to exist and to relax the assumption of monotonicity. In the ALS data, the SACE for PEG is estimated both with and without the monotonicity assumption to assess its impact.

4.3 Model for Missing Data Mechanism

As discussed before, for patients with is undefined, so it is only valid to model for those individuals in the stratum with treatment, the stratum without treatment, the stratum with treatment, and the stratum without treatment. The model for is presented in (5).


where is a set of covariates associated with the missing data mechanism. The individual probability of missingness is dependent on strata and treatment value and is represented by . When Assumption 4 is used instead of Assumption 3, this model is not needed.

- -
- -
- -
- -
- -
- -
Table 1: Individual likelihood by observed groups if is known

4.4 Bayesian Inference

Given the models specified in Section 4.1-4.3, the individual likelihoods for all possible combinations of , , and are given in Table LABEL:t3tab:like, where each cell value is the likelihood of the complete data if is known. The conditional probability of given the observed data is the ratio of each cell to the total of that row. Rows and are included in Table LABEL:t3tab:like for a comprehensive understanding of the possible combinations of treatment and survival, but individuals who fall into these groups do not have outcome data and hence do not contribute to the observed data likelihood. As such, these individuals will only contribute to the model for the probability of principal strata, which is reflected in the observed data likelihood in equation (6).


Prior distributions for the specified parameters in the observed data likelihood should be chosen carefully, with thought to distributions that may be informative, proper, and conjugate where appropriate. For this analysis, a conjugate multivariate normal prior distribution is assumed for ,

. Similarly, a conjugate prior distribution is assumed for

, . We choose diffused priors for and .

Though the principal stratum of each individual is unknown, the observed treatment and survival groups may be used to inform imputation of the principal stratum assignments. Applying this idea, we use the Data Augmentation (DA) algorithm

(Tanner and Wong, 1987) for posterior computation, in which information about the latent groups, namely, , is imputed and subsequently the posterior parameters distributions are simulated.

The DA algorithm includes two iterative and alternating steps to allow for posterior inference. The first step, the Imputation or I-step, imputes the value of the principal stratum for each individual. This is accomplished by using the parameter values , , , and (for approach 2 only) from the current approximation of posterior (from the th iteration) to generate by using the conditional probabilities that are given by taking the ratio of each cell value to the row total in Table LABEL:t3tab:like. These conditional probabilities,

, are used in a Bernoulli distribution that imputes individual memberships to one of the two principal strata that correspond with the observed group

. More specifically, at the (+1) iteration, each individual has a probability of being in a stratum that depends on their observed values (, , , , ).

The P-step, or Posterior step, is then employed by using the imputed complete data set, and the parameters can be updated to by sampling from the full conditional distributions of each parameter based on the complete data likelihood. Either a Gibbs Sampler or a Metropolis-Hastings (MH) Algorithm can be employed for sampling. The details of the DA algorithm are provided in Web Appendices A and B.

If we use Assumption 4 instead of Assumption 3, we can drop all terms involving from Table LABEL:t3tab:like and use the complete data likelihood (3), leading to a simpler version of the aforementioned DA algorithm.

5 Analysis of ALS Clinic Data

We apply the proposed approach to analysis of the data from patients who visited the Emory ALS clinic at least once between 1997 and 2014. Patients are excluded from the analysis for not having any follow up clinic visits from baseline to outcome measurement at month 18 post-baseline or for having long survival times (5 years post-baseline), resulting in a cohort of for our analyses. It has been noted that ALS patients with long survival times are likely different from the other ALS patients (Mateen et al., 2010), which may not be fully explained by clinical variables collected. Characteristics measured at baseline for each individual include sex, race, age, site of disease onset, months from symptom onset to baseline, months from diagnosis to baseline, body mass index (BMI), vital capacity (VC), forced vital capacity (FVC), negative inspiratory force (NIF), and ALS Functional Rating Scale Revised score (ALSFRS-R). Continuous variables, namely age, months from symptom onset to baseline, months from diagnosis to baseline BMI, VC, FVC, NIF, and ALSFRS-R, are standardized before being included as covariates for the propensity score model and the modesls for principal stratification model, missing data mechanism and outcomes. Since our focus is to handle missing outcome values, we impute missing values in covariates prior to subsequent analysis using the proposed approach.

Of the 275 treated patients, 32% or 89 individuals are alive 18 months from baseline, while of the 540 untreated individuals, 34% or 186 individuals are alive at this time-point (p = 0.606). Baseline measurements of BMI, FVC, and VC, as well as age at diagnosis, survival until , and proportion of white versus other races are not significantly different between treated and untreated groups. However, clinically relevant differences between treatment groups do exist. NIF and ALSFRS-R scores at baseline are significantly higher for treated versus untreated individuals at a level of . Also the proportions of female individuals and of individuals with spinal disease are significantly differnt for treated individuals than untreated individuals (p = 0.040 and p 0.001). Most interestingly, the naïve analysis that compares the outcome of interest, BMI at 18 months post-baseline, across treatment groups indicates that treated individuals have a BMI that is 1 unit lower than the untreated group, and this difference in significant (p = 0.021). These results suggest that the effect of PEG may be confounded by variables that are different between treated and untreated groups, hence it is important to use the generalized propensity score. A table of patient characteristics by treatment status is available in the online Appendix (Web Table 1).

The 275 survivors at are significantly different from 540 non-surviving individuals by nearly every clinical comparison, with only demographic characteristics of sex and race being similar amongst the two groups, suggesting the need to adjust for the post-treatment variable, censoring by death. Additionally there are significant differences amongst patients with observed outcome and those with missing outcome data in, for example, BMI, FVC, VC, NIF, and ALS FRS-R score at baseline. indicating that the missing completely at random (MCAR) assumption is unrealistic and necessitates the use of the proposed approach for handling missing data. Web Tables 2 and 3 providing support for these statements are available in the online Appendix.

The SACE of PEG treatment is estimated using three modeling approaches. In addition to the two approaches using Assumptions 3 (latent ignorable missingness) and 4 (ignorable missingness) as described in Section 4.3, we consider a third approach that includes only individuals with an outcome measurement at , analogous to a complete-case analysis that essentially assumes that the outcomes are MCAR. We also apply the same set of methods without the propensity score adjustment as well as with a monotonicity assumption, i.e., removal of the stratum (Zhang and Rubin, 2003). Finally, to allow for flexibility, polynomial splines are used for incorporating the generalized propensity score in the models with denoting the degree of the polynomial spline. For all analyses, the MCMC algorithm is run for a total of 5,000 iterations, with a burn-in period of 3000 iterations.

To identify the model that best fits the data, we use the deviance information criterion (DIC) (Spiegelhalter et al., 2002). Models with smaller values of DIC are preferred to models with larger values, and the best fit is indicated by a minimum DIC. Table 2 presents DIC for each of the modeling approaches under different assumptions for missing data mechanism with various degrees of polynomial splines for the propensity score. The models for all three approaches with have the lowest DIC, and therefore offer the best fit for each modelling approach. Particularly, the model assuming latent ignorable missingness with has the lowest DIC among all models.

0 1 2 3 4 5
Assumption 3 1702.55 1683.98 1668.20 1777.01 1658.00 2034.26
Assumption 4 1769.40 1722.59 1727.66 1733.74 1714.54 1719.51
MCAR 2715.46 2658.06 2655.16 2654.74 2653.52 3200.87
Table 2: Deviance Information Criterion (DIC) for analysis under Assumption 3 (latent ignorable missingness), Assumption 4 (ignorable missingness) and MCAR, as , the degree of polynomial splines for propensity scores, varies.
Effect Estimates
Latent Ignorable Ignorable
Missingness Missingness MCAR
With PS Time to PEG 0.17 0.16 0.14
(-0.06,0.39) (-0.05,0.38) (-0.01,0.29)
Binary Indicator 3.09 3.17 2.28
(0.01,5.83) (0.14,6.24) (0.79,3.77)
Without PS Time to PEG 0.24 0.24 0.20
(0.03,0.46) (0.03,0.45) (0.03,0.35)
Binary Indicator 0.45 0.56 -0.05
(-2.42,3.45) (-2.19,3.31) (-1.37,1.23)
Table 3:

Estimates of SACE of PEG treatment (with 95% credible intervals) on BMI at 18 months post-baseline Assumption 3 (latent ignorable missingness), Assumption 4 (ignorable missingness) and MCAR with and without adjustment for propensity scores (PS) (N=815).

The top two rows of Table 3 present a comparison of the different modeling approaches when propensity scores are included. Specifically, the results from the models with are presented for each modelling approach, as these have been chosen best fitting models according to DIC. In contrast to the naïve analysis in Web Table 1 that shows a negative effect of PEG, analyses under both Assumption 3 and Assumption 4 show that there is a significant, positive SACE of PEG surgery on the outcome of BMI at 18 months post-baseline. Holding all else constant and assuming PEG insertion at or immediately after baseline, BMI at 18 months increases by about 3 units for those individuals who receive treatment when compared to those who are not treated among survivors. The effect of time to treatment is also positive but not statistically significant. Noting that a higher time to PEG () leads to a shorter time from PEG to the time of outcome measurement, these results seem to suggest that the positive effect of PEG on BMI decreases gradually after the treatment, i.e., the possibility for a waning long-term effect. Additionally, the approach assuming MCAR indicates that the estimated effects of time to treatment and binary treatment are smaller in magnitude than those estimated from the two approaches under Assumptions 3 and 4, respectively.

The lower half of Table 3 presents the results from the three approaches without adjustment for propensity scores in the estimation of the SACE of PEG treatment, showing some changes in the magnitude of the effect estimates. On the one hand, removing the propensity score from the models in Approaches 1 and 2, the time to treatment effect remains positive and is now significant. On the other hand, in these same models, the coefficient estimate for the binary treatment indicator is drastically smaller in magnitude and no longer statistically significant. In the model that assumes MCAR and does not use propensity scores, the coefficient estimate for the binary treatment indicator changes direction and is non-significant.

In addition, the results, not provided, are not sensitive to the assumption of monotonicity. This may be due to the small proportion of individuals in strata when all four strata are considered. When monotonicity is not assumed, most patients are in the and strata, with a and strata comprising less than 20% of the individuals in total. It is conceivable then that reallocating such a small proportion of individuals when removing the stratum would not substantially change the effect estimates of the other strata.

Overall, the results in our analysis demonstrate a positive SACE of PEG, in contrast to inconclusive results in magnitude, direction, and significance from previous observational studies of PEG in association with weight or BMI. Given that our methodology is developed to account for the many complicating issues of an observational study that were not addressed in previous studies, it is likely that the positive effect of PEG estimated in this analysis is more definitive. As data collection is still ongoing in the ALS Clinic Registry, the proposed approach can be applied to this registry in the future, potentially further validating our current findings.

6 Discussion

Our current work represents the very first attempt to leverage the rich data in the Emory ALS Clinic Registry for assessing the effect of PEG in a large ALS population. Our proposed framework for causal inference addresses several challenges in the analysis of this data that have not been accounted for in previous studies of PEG. Our results show a positive effect of PEG placement on survivors that may be waning over time after insertion. Our approach can be directly applied or extended to evaluate effects of PEG or other palliative procedures such as the non-invasive positive pressure ventilation (NIPPV) (Miller et al., 2009) on additional outcomes that are of interest to clinicians including quality of life and disease progression.

One limitation of our analysis is that we do not have data on the usage of PEG tube after its insertion. Of note, patients with PEG may elect not to use PEG for nutrition delivery from time to time, thus the level of usage, similar to treatment adherence, could have an impact on patient outcomes such as BMI. Additionally, while the use of propensity scores within the principal stratification framework allows for the estimation of an unbiased principal effect using observational data, the reliability of removing bias via propensity score adjustment is predicated on the assumption of strongly ignorable treatment assignment, which means there must be no unmeasured confounders. Partly based on our experience from this work, a future prospective observational study, currently in the design and planning stage, will collect usage data for palliative procedures such as PEG and NIPPV as well as additional potential confounders that are not collected or validated in standard clinical settings. In addition, future applications of the proposed methods to other data with richer measurements of confounders should further demonstrate the reduction of confounding.

Making adjustments for missing outcome data in the context of causal inference typically requires strong assumptions about the ignorability of the missing data mechanism and creativity in the modeling framework. In our data application, the similar results under Assumptions 3 and 4 indicate that the findings are not sensitive to the latent ignorable assumption for this data, though we cannot empirically test this assumption. When in doubt, further stratification by missingness of outcome data and within the principal stratification framework may allow for additional flexibility in the assumptions imposed on the missing data mechanism.

Future consideration may also be given to jointly modeling the propensity score with the outcome model, missing data model, and principal strata model in a Bayesian framework, similar in spirit with Zigler et al. (2013) and Zigler and Dominici (2014). This would allow the quantities observed by in each of these three models to affect the posterior of propensity score in each MCMC iteration. While this could provide a more robust propensity score adjustment, Zigler et al. (2013) showed that the feedback between model stages in joint modeling can cause biased causal effect estimates if individual covariates are not also adjusted for in the outcome model. This bias should be accounted for if one uses joint modeling of the four models of outcome, missing data mechanism, propensity scores, and principal strata.


  • Desport et al. (2006) Desport, J.-C., Torny, F., Lacoste, M., Preux, P.-M., and Couratier, P. (2006). Hypermetabolism in als: correlations with clinical and paraclinical parameters. Neurodegenerative Diseases, 2(3-4):202–207.
  • Frangakis and Rubin (1999) Frangakis, C. E. and Rubin, D. B. (1999). Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika, 86(2):365–379.
  • Frangakis and Rubin (2002) Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics, 58(1):21–29.
  • Frumento et al. (2012) Frumento, P., Mealli, F., Pacini, B., and Rubin, D. B. (2012). Evaluating the effect of training on wages in the presence of noncompliance, nonemployment, and missing outcome data. Journal of the American Statistical Association, 107(498):450–466.
  • Gelinas and Miller (2000) Gelinas, D. F. and Miller, R. G. (2000). A treatable disease: a guide to the management of amyotrophic lateral sclerosis. Amyotrophic lateral sclerosis., pages 405–21.
  • Goyal and Mozaffar (2014) Goyal, N. A. and Mozaffar, T. (2014). Respiratory and nutritional support in amyotrophic lateral sclerosis. Current treatment options in neurology, 16(2):1–12.
  • Hirano and Imbens (2004) Hirano, K. and Imbens, G. W. (2004). The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives, pages 73–84.
  • Holland (1986) Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396):945–960.
  • Imai and Van Dyk (2004) Imai, K. and Van Dyk, D. A. (2004). Causal inference with general treatment regimes: Generalizing the propensity score. Journal of the American Statistical Association, 99(467):854–866.
  • Jo and Stuart (2009) Jo, B. and Stuart, E. A. (2009). On the use of propensity scores in principal causal effect estimation. Statistics in medicine, 28(23):2857–2875.
  • Kasarskis et al. (1996) Kasarskis, E. J., Berryman, S., Vanderleest, J. G., Schneider, A. R., and McClain, C. (1996). Nutritional status of patients with amyotrophic lateral sclerosis: relation to the proximity of death. The American journal of clinical nutrition, 63(1):130–137.
  • Katzberg and Benatar (2011) Katzberg, H. D. and Benatar, M. (2011).

    Enteral tube feeding for amyotrophic lateral sclerosis/motor neuron disease.

    Cochrane Database Syst Rev, 1.
  • Little and Rubin (2002) Little, R. and Rubin, D. (2002). Statistical Analysis with Missing Data. John Wiley: New York.
  • Lu (2005) Lu, B. (2005). Propensity score matching with time-dependent covariates. Biometrics, 61(3):721–728.
  • Mateen et al. (2010) Mateen, F. J., Carone, M., and Sorenson, E. J. (2010). Patients who survive 5 years or more with als in olmsted county, 1925–2004. Journal of Neurology, Neurosurgery & Psychiatry.
  • Mazzini et al. (1995) Mazzini, L., Corra, T., Zaccala, M., Mora, G., Del Piano, M., and Galante, M. (1995). Percutaneous endoscopic gastrostomy and enteral nutrition in amyotrophic lateral sclerosis. Journal of neurology, 242(10):695–698.
  • Miller et al. (2009) Miller, R. G., Jackson, C. E., Kasarskis, E. J., England, J., Forshew, D., Johnston, W., Kalra, S., Katz, J., Mitsumoto, H., Rosenfeld, J., Shoesmith, C., Strong, M., and Woolley, S. (2009). Practice parameter update: The care of the patient with amyotrophic lateral sclerosis: Drug, nutritional, and respiratory therapies (an evidence-based review) report of the quality standards subcommittee of the american academy of neurology. Neurology, 73(15):1218–1226.
  • Muscaritoli et al. (2012) Muscaritoli, M., Kushta, I., Molfino, A., Inghilleri, M., Sabatelli, M., and Rossi Fanelli, F. (2012). Nutritional and metabolic support in patients with amyotrophic lateral sclerosis. Nutrition, 28(10):959–966.
  • Park et al. (1992) Park, R., Allison, M., Lang, J. e., Spence, E., Morris, A., Danesh, B., Russell, R., and Mills, P. (1992). Randomised comparison of percutaneous endoscopic gastrostomy and nasogastric tube feeding in patients with persisting neurological dysphagia. British Medical Journal, 304(6839):1406.
  • Procaccini and Nemergut (2008) Procaccini, N. J. and Nemergut, E. C. (2008). Percutaneous endoscopic gastrostomy in the patient with amyotrophic lateral sclerosis: Risk vs benefit? Practical Gastroenterology, page 24.
  • Rosenbaum and Rubin (1983) Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55.
  • Rubin (1980) Rubin, D. B. (1980). Randomization analysis of experimental data: The fisher randomization test (comment). Journal of the American Statistical Association, 75(371):591–593.
  • Rubin (2006) Rubin, D. B. (2006). Causal inference through potential outcomes and principal stratification: application to studies with “censoring” due to death. Statistical Science, 21(3):299–309.
  • Schwartz et al. (2012) Schwartz, S., Li, F., and Reiter, J. P. (2012).

    Sensitivity analysis for unmeasured confounding in principal stratification settings with binary variables.

    Statistics in Medicine, 31(10):949–962.
  • Spiegelhalter et al. (2002) Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4):583–639.
  • Tanner and Wong (1987) Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American statistical Association, 82(398):528–540.
  • Zhang and Rubin (2003) Zhang, J. L. and Rubin, D. B. (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by “death”. Journal of Educational and Behavioral Statistics, 28(4):353–368.
  • Zhang et al. (2009) Zhang, J. L., Rubin, D. B., and Mealli, F. (2009). Likelihood-based analysis of causal effects of job-training programs using principal stratification. Journal of the American Statistical Association, 104(485).
  • Zigler and Dominici (2014) Zigler, C. M. and Dominici, F. (2014). Uncertainty in propensity score estimation: Bayesian methods for variable selection and model-averaged causal effects. Journal of the American Statistical Association, 109(505):95–107.
  • Zigler et al. (2013) Zigler, C. M., Watts, K., Yeh, R. W., Wang, Y., Coull, B. A., and Dominici, F. (2013). Model feedback in bayesian propensity score estimation. Biometrics.