A Flexible Joint Longitudinal-Survival Modeling Framework for Incorporating Multiple Longitudinal Biomarkers

07/26/2018 ∙ by Sepehr Akhavan-Masouleh, et al. ∙ University of California, Irvine 0

We are interested in survival analysis of hemodialysis patients for whom several biomarkers are recorded over time. Motivated by this challenging problem, we propose a general framework for multivariate joint longitudinal-survival modeling that can be used to examine the association between several longitudinally recorded covariates and a time-to-event endpoint. Our method allows for simultaneous modeling of longitudinal covariates by taking their correlation into account. This leads to a more efficient method for modeling their trajectories over time, and hence, it can better capture their relationship to the survival outcomes.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

In this paper, we aim to examine the association between mortality and the longitudinal measurements of several biomarkers among hemodialysis patients. We are specifically focusing on the data obtained from a 5-year (January 2007-December 2011) cohort of 109,718 hemodialysis patients who were treated for dialysis in dialysis clinics in the United States. The dataset was studied in detail by Ravel et al. (2015)

. In such studies, it is common to measure multiple longitudinal biomarkers during the study follow up. Collected longitudinal measures on each subject tend to be correlated and temporally dependent as they are taken on the same subject. One could of course model the trajectory of each biomarker independently from other collected biomarkers. However, by simultaneously modeling the trajectory of all biomarkers, one can take the correlation between the different biomarkers into account. This could lead to a more precise model of biomarker trajectories. More specifically, when some biomarkers are measured less frequently compared to the other biomarkers (e.g., if they are difficult or expensive to measure), simultaneously modeling biomarkers can be particularly useful as one can gain more precision in estimating less frequent biomarkers by taking the correlation between all biomarkers into account and by borrowing information from the higher frequency biomarkers to better predict the lower frequency ones. Finally, a joint longitudinal-survival model with a flexible longitudinal component, which is capable of modeling the trajectory of multiple biomarkers simultaneously, can be used to test the association between the survival outcome and the longitudinal biomarkers.

Throughout this paper, we shall use the word ”joint” when we model longitudinal-survival data simultaneously. We reserve the word ”simultaneous” to refer to modeling multiple longitudinal measures at a same time rather than modeling each biomarker trajectory independently from others.

1.1 Some Related Methods

Simultaneously modeling longitudinal biomarkers can be considered in the context of a multivariate temporal process model that can be used both for inferential purposes as well as prediction and interpolation purposes. In order to develop such model, specification of a valid cross-covariance function is necessary. A valid cross-covariance function is required to lead to a valid positive-definite covariance matrix for any number of time points and at any choice of these time points

(Gelfand and Banerjee, 2010). A common cross-covariance function is to use separable cross-covariance function construction that shall be explained in more detail in Section 3.

The question of describing the correlation between multiple longitudinal measures has been addressed from a different perspective in geostatistics literature with a focus on spatial data. Bernardo et al. (1998) and Berger et al. (2003) described the kernel convolution technique for creating stationary and non-stationary spatial processes. Majumdar and Gelfand (2007) proposed producing cross-covariance functions by using convolution of covariance functions. Multivariate models in geostatistics started with Matheron (1973) and by introduction of some new concepts including cross-variogram and co-Kriging. Gelfand and Banerjee (2010) defined co-Kriging as a spatial prediction method that uses both the information of the process being considered as well as the information from other related processes. Co-Kriging is commonly addressed in the context of the linear models as these models are easily interpreted. These models that are commonly known as linear model of co-regionalization (LMC) have been considered widely in the literature including Grzebyk and Wackernagel (1994), Schmidt and Gelfand (2003), and Ver Hoef, Cressie and Barry (2004).

In a different work, we proposed a flexible joint longitudinal-survival Modeling framework for quantifying the association between a longitudinal biomarker and Survival Outcomes (Akhavan et al., 2018). We are now interested in extending our method to be able to model multiple biomarkers simultaneously. In Section 2, we will show how a Gaussian process model can be extended to a multivariate Gaussian process to model multiple longitudinal biomarkers simultaneously. In Section 3, using our proposed multivariate Gaussian process, we build a joint longitudinal-survival model capable of relating longitudinal biomarkers to survival outcome. In Section 4, we use several simulation studies to evaluate our proposed model. Section 5 presents our data analysis results.

2 Multivariate Gaussian Process

Consider the function that relates an input space to an output space . As an alternative to an explicit functional assumption on , one can assume a Gaussian process prior on . One may consider time as the input space and the space of a longitudinal measure as the output space and may use Gaussian process prior as a prior on all plausible functions relating time to the longitudinal measure at time .

In the setting above, is considered as a univariate function with one output at each time point . In general, however,

can be a multivariate function with a vector of outputs at each time

. A multivariate function will require a multivariate Gaussian process prior. One can consider a more general frame work of the following form

where is a vector of outputs at time , is a multivariate function with a multivariate Gaussian process prior with a mean vector function and a cross-covariance matrix function .

Consider a multivariate longitudinal vector at time with the dimension . Without loss of generality and for simplicity of the notations, we assume a multivariate longitudinal random vector with mean zero, . A cross-covariance function between two generic time points and is a matrix function with dimension where the element of this matrix is defined as


where and are . Equation (2.1) indicates that the cross-covariance matrix function can be defined as


Consider arbitrary time points . At each time point , is a vector. Concatenating n such output vectors, one can define an vector , where . Random vector is mean-zero with a covariance matrix with the dimension . is a block matrix where each block is the cross-covariance matrix corresponding to time and for all outputs at each time point.

As a covariance matrix, has to be symmetric and positive definite. As Gelfand and Banerjee (2010) showed, this requires the covariance matrix function to satisfy the two following conditions of

  1. ,

  2. for any integer value and for any arbitrary collection of ”n” time points:

A multivariate process is stationary if the cross-covariance matrix function depends only on the time difference between and , where we can write . Multivariate process is called isotropic if the cross-covariance matrix function depends only on the absolute difference between and , where we can write . Yadrenko (1987) showed that under isotropic condition, a covariance function will form a valid cross-covariance matrix function if and only if the function for a positive-definite measure has a cross-spectral representation of the form

Despite the existence of many methods proposed in the literature, when cross-covariance functions are unknown, specification of a valid cross-covariance function based on the observed data is a very difficult task. One common approach to specify a valid cross-covariance function is to use a class of covariance functions known as separable cross-covariance structures that shall be introduced in the next section.

Now consider as a vector of longitudinal variables all measured at time . The cross-covariance matrix function is then a matrix where it’s element is equal to that was defined in equation (2.1). One can assume that the covariance between the longitudinal variables at each time-point remains the same at all time points and can be specified with a positive definite covariance matrix . The correlation between measured values at time and measured values at time can then be expressed using a univariate correlation function . Given this specification, one can write


where is of size and represents the covariance matrix between the elements of that remain the same at all time points , and represents the correlation between measures at time and measures at time .

Now consider time points where at each time point , we observe a stack of

longitudinal random variables

. Consider as the vertical stack of longitudinal vectors, each element of that vector is an observed at a time point . can be defined as


where the element of is represented with and is equal to , the notation represents the Kronecker product, and is the non-temporal covariance function between the longitudinal variables that is assumed to remain the same across time points ’s.

By using a separable cross-covariance function and given that matrix is positive definite by definition, with a positive definite matrix , it’s guaranteed that will be positive definite. Furthermore, by using a separable cross-covariance structure, the determinant of the cross-covariance function, , and the inverse of the cross-covariance function,, will become computational convenient to deal with. In particular, by using the properties of the Kronecker product, one can show that the determinant of the cross-covariance can be written as , where and are the determinant of the matrix and the determinant of the , respectively. Also, the inverse of the cross-covariance function can be written as . We shall use the class of separable cross-covariance functions to setup our proposed model.

3 Joint Model

In this section, we first start by introducing the likelihood specification of our joint model. We then continue with introducing the longitudinal component of the model and the survival component. Our proposed method can work for any number of longitudinal biomarkers, however, for the sake of simplicity of the notations, we consider only two longitudinal biomarkers. We refer to the first longitudinal biomarker with and to the second biomarker with . We denote survival outcome with . refers to how many subjects are being followed up in the study. and refer to the number of longitudinal biomarker 1 measures and biomarker 2 measures obtained for subject at time points and , respectively. Also, associated with each subject, there is an observed survival time, and event indicator , where and denote the true event time and the censoring time for subject , respectively. Further, we make the common assumption that is independent of for all , .

We define the contribution of each subject to the joint model likelihood as the multiplication of the likelihood function of the longitudinal measures for that subject and her/his time-to-event likelihood that is conditioned on her/his longitudinal measures. Let , , and denote the longitudinal likelihood contribution, the conditional survival likelihood contribution, and the joint likelihood contribution for subject . One can write the joint longitudinal-survival likelihood function as


In what follows, we explain the components of this model.

3.1 Longitudinal Component

We motivate the development of the multivariate Gaussian process model of two longitudinal biomarkers by first considering the following simple model for a single subject


For simplicity of the notation, we assume that both biomarkers are measured simultaneously, that means at each time point , we get to obtain measures on both biomarkers. Our method is not limited to this assumption and once readers are introduced with the model, we shall extend the notation to a model with no such assumption. We define as the number of longitudinal measures per biomarker. These measures are obtained at an arbitrary time points . In the equation (3.2), and are vectors of longitudinal biomarker 1 measurements and longitudinal biomarker 2 measurements, each of size respectively. We shall stack and together into a column vector that is of size . is a vector of repeated random intercepts corresponding to biomarker 1 and is a vector of repeated random intercepts corresponding to biomarker 2, each of size . Also, we shall stack and into a column vector of size . The model in equation (3.2) assumes biomarker 1 and biomarker 2 are independent and each biomarker has its own measurement error matrix. We consider and , where and are shared parameters across all subjects.

By adding a stochastic component that is indexed by time to the model in equation (3.2), we can relax the independence assumption between biomarkers and we can also extend the model to capture non-linear patterns over time. Specifically, we consider a stochastic vector, , that is a realization from a multivariate Gaussian process prior, , that is mean zero and has a separable cross-covariance function. Thus for subject , , where is a column vector that includes a stack of and , where and .

We characterize the covariance function, , using a separable cross-covariance structure as


where is a 2 by 2 matrix that characterizes the the covariance between the two biomarkers and is assumed to be time-invariant, and is a temporal covariance matrix. The matrix is shared across all subjects with elements and

characterizing marginal variances of the first and the second biomarker processes respectively, and with the

element characterizing the covariance between the two processes. In specific, one can decompose the covariance matrix into the following form


where and are square-root of the within biomarker 1 and biomarker 2 variances and is a correlation matrix. Commonly, is set to equal 1 where in that case, it’s assumed that will also capture the within biomarker 1 variability and is treated as a parameter indicating the relative biomarker variability between biomarker 2 and biomarker 1. Hence, we define

We re-write the equation (3.5) as


is the covariance matrix characterizing how longitudinal measures change over time. Under the separable cross-covariance structure, we assume both biomarkers share the same covariance structure for changes in their values over time. We characterize as an matrix with elements that is defined as


where the hyperparameter

controls the correlation length, and controls the height of oscillations (Banerjee et al. 2004), and and are two different time points. For notational simplicity, we define . We can extend the longitudinal model in equation (3.2) to the flexible model below


where is a stochastic vectors sampled from a Gaussian process prior of the form


In the model defined by equation (3.7), and are assumed to be common across all subjects. Also, we assume the correlation length is fixed and hence, the subject-specific parameter will have the role of capturing the within-subject volatility of the longitudinal biomarkers. Finally, the longitudinal component of our proposed joint model can be written as


where and are random intercepts associated with biomarker 1 and biomarker 2, respectively. is a column vectors of size that is a stack of and each repeated times. Finally, matrix was decomposed based on the equation (3.5).

3.2 Survival Component

Our goal is to quantify the association between the longitudinal biomarkers of interest and the time-to-event outcomes by directly adjusting for biomarkers measured values in a survival component of our proposed joint model. While usually biomarkers are measured on a discrete lab-visit basis (ex. every month), the event of interest happens on a continuous basis. While common frequentist models use the so-called ”last-observation-carried” forward, by jointly modeling longitudinal-survival data, one can properly impute biomarker measures at each individual’s event time. In particular and from the Bayesian modeling perspective, in each MCMC iteration, given the sampled parameters for each individual and by using the flexible multivariate Gaussian process in the longitudinal component of the model, there exists posterior trajectories of biomarkers for each individual. Our method, then, considers the posterior mean of those trajectories as the proposed trajectory for each individual’s biomarker values over time at that iteration. The posterior mean trajectories of our biomarkers of interest, then, can be used to impute time-dependent biomarker covariates inside the survival component of the model.

In order to quantify the association between two longitudinal biomarkers, which are modeled simultaneously using the longitudinal component of the model, and the time-to-event outcomes, we define our survival component by using a multiplicative hazard model with the general form of

where is the event time for subject , denotes a baseline hazard function, is a vector of baseline covariates, are longitudinal covariates from the longitudinal component of the model, and and are regression coefficients interpretable as the log relative risk of ”death” per every unit increase of their corresponding covariates.

We consider a Weibull distribution for the survival component to allow for log-linear changes in the baseline hazard function over time. Thus we assume


that means


Weibull distribution is available in closed form and can be evaluated computationally efficiently. Under this parameterization of the Weibull distribution, covariates can be incorporated into the model by defining , where are coefficients associated with baseline survival covariates , and are longitudinal coefficients associated with longitudinal covariates .

In particular, we are interested in a model that directly includes the two longitudinal biomarker values at time as a covariate inside the survival model. Hence, we define our model as



where is a common shape parameter shared with all subjects. is a subject specific coefficient in the model which allows subject-specific baseline hazard. and are baseline covariates and their corresponding coefficients, respectively. Finally, and are coefficients linking the longitudinal biomarker1 value at time and longitudinal biomarker2 value at time to the hazard for mortality, respectively.

In order to fit a fully joint longitudinal-survival model, at each iteration of the MCMC and for a time-point , predicted biomarker1 and biomarker2 values for individual are of the following form

where is a vector where its first element is the predicted value of the first biomarker and its second element is the predicted value for the second biomarker. is a column vectors of size that represents the observed biomarker values, where its first elements are the observed biomarker1 values and the remaining elements are the observed biomarker2 values. is column vector of size that includes all time points at which values of the biomarkers were observed. is the time at which by using the posterior trajectory of biomarkers at each MCMC iteration, we want to impute a predicted value per biomarker. Given our proposed longitudinal component setup, is distributed bivariate Normal with mean and with covariance matrix that are of the following forms




where and were defined earlier in Section 3.1. is defined similarly as except that is replaced with . is a column vector size 2 by 1 with random intercept of the first biomarker, , as its first element and random intercept of the second biomarker, , as it’s second element. is a column vector of size , where the first elements are all the random intercept for the first biomarker and the remaining elements are all the random intercept for the second biomarker.

In order to avoid an explicit distributional assumption on the survival times, we specify our survival model as an infinite mixture of Weibull distributions mixed on the parameter. To do so, we use the Dirichlet process mixture of Weibull distributions defined as


where is a fixed parameter, is a subject-specific mean parameter from a distribution with a DP prior, is the concentration parameter of the DP and is the base distribution. By using the Dirichlet process prior on the distribution of , we allow patients with similar baseline hazards to cluster together which subsequently provides a stronger likelihood to estimate the baseline hazards. For other coefficients in the survival model, we assume a multivariate normal prior as

where is a set of coefficients associated with the baseline survival covariates, and are coefficients associated with value of the first and the second biomarkers respectively, is a prior variance for each coefficient, and

is the identity matrix.

For the shared shape parameter , we consider a log-Normal prior, , and specify the prior on the concentration parameter of our DP model to be .

Finally, since the hazard function includes time-varying covariates, evaluation of the log likelihood that involves integration of the hazard function over time is done using the rectangular integration discussed in detail in our previous paper (Akhavan et al., 2018). More details on posterior sampling and implementation of our method are provided in Appendix A.

4 Simulation Studies

In this section, in order to evaluate our proposed model, we provide two types of simulations. Type I simulations (4.1) only consider the longitudinal component of the model and are aimed to show the benefit of simultaneously modeling multiple longitudinal biomakers as opposed to modeling biomakers each separately. We shall refer to the former with multi and to the latter with uni. The second type of simulations, type II (4.2), are aimed to evaluate the performance of the proposed joint longitudinal-survival model.

4.1 Multivariate Gaussian Process Model vs. Multiple Univariate Gaussian Processes

One natural question in modeling multiple biomarkers is whether there is any benefit in modeling these biomakers simultaneously (multi) as opposed to modeling each biomaker process independently (uni). Type I simulations are design to evaluate our proposed multi model vs. an alternative uni model.

We generate synthetic data for two biomakers of albumin () and BMI (). Albumin and BMI values are simulated off of the multivariate Gaussian process model for 100 subjects and for each subject 60 albumin and 60 BMI values. The multivariate Gaussian model is of the following form (more details in the appendix A.1):


The values are simulated from the distribution. and are simulated from the and the distributions, respectively. Measurement errors and are both set to be equal to .

To explore the benefit of simultaneously modeling biomakers, we compare our proposed methodology here against an alternative modeling framework (Akhavan et al., 2018) that models biomarker processes independently. Our general comparison scheme is to randomly remove some of the obtained simulated biomarker values and treat them as missing. Then we use our fitted models in order to predict missing biomaker values. Models then are compared in terms of the prediction mean error (MSE) of the missing values. we consider the following three real world scenarios:

  1. Scenario 1: Comparison of the models in terms of the prediction MSE and as a function of correlation between biomakers and the amount of time overlap between missing biomarkers.

  2. Scenario 2: Here we tackle the real world scenario where on biomarker is observed less frequently compared to the other biomarker (ex. one biomarker is more expensive or more difficult to measure).

  3. Scenario 3: Is our proposed multivariate model capable of handling biomarker processes with different volatility given that our methodology assumes common values across biomarkers with as a common measure of volatility?

4.1.1 Scenario 1

Under this scenario, we remove one third of the biomarker values and treat them as missing. These 20 values are removed in the three following ways:

  1. When one biomarker is missing, the other biomarker is also missing (i.e. 100% missing overlap).

  2. When one biomarker is missing, the other biomarker is observed (i.e. 0% missing overlap).

  3. Half of the times both biomarkers are missing and half of the times, when one is missing, the other biomarker is observed (i.e. 50% missing overlap)

Table 3 shows the simulation results from 100 generated datasets where the multi and uni models are compared in terms of the average prediction MSE and as a function of biomarker correlations and missing values % overlap. Based on the simulation result, by taking the between-biomarker correlations into account, multi model leads to a smaller overall MSE compared to the uni model. In Table 3, %Dec. indicates the percentage decrease in MSE comparing the multi model and the uni model. As expected, as the correlation between the two processes increases, the multi model leads to a smaller MSE compared to the uni model. When the percent overlap between missing values decreases, indicating when one biomarker is missing, the other biomarker is more likely to be observed, the information from one biomarker can help the multi model better predict the missing biomarker, and hence, will lead to a smaller MSE.

4.1.2 Scenario 2

Under this simulation scenario, we consider the real world scenario where one biomarker, due to the cost of the procedure or the difficulty of obtaining biomarker measures, is measured less frequently compared to the other biomarker. We generate synthetic data for two biomarkers each with 60 measurements and at five different correlation levels between the two biomarker processes. We then randomly remove measurements from the first biomarker at the 20%, 50%, and 80% rate. Models are then compared in terms of the prediction MSE of the missing values. As the results in Table 3 show, the proposed multivariate model leads to lower MSEs compared to the univariate model. As expected, as the correlation between the two biomarker processes increases, multi model better takes advantage of the information from one biomarker to predict the other and led to a smaller MSE.

Correlation 100% Overlap 50% Overlap 0% Overlap
Multi Uni %Dec. Multi Uni %Dec. Multi Uni %Dec.
0.1 0.237 0.243 2.5% 0.202 0.206 1.9% 0.207 0.212 2.4%
0.3 0.201 0.207 2.9% 0.215 0.227 5.3% 0.206 0.217 5.1%
0.5 0.209 0.217 3.7% 0.209 0.228 9.3% 0.196 0.227 13.7%
0.7 0.207 0.217 4.6% 0.189 0.228 17.1% 0.185 0.235 21.3%
0.9 0.194 0.204 4.9% 0.158 0.219 27.9% 0.139 0.237 41.4%
Table 1: Scenario 1: Comparison of the multivariate model (multi) and the univariate model (uni) in terms of prediction MSE and as a function of biomarker correlation and the time-overlap between missing biomarkers (% overlap).
Correlation 20% Freq. 50% Freq. 80% Freq
Multi Uni %Dec. Multi Uni %Dec. Multi Uni %Dec.
0.1 0.410 0.420 2.4% 0.281 0.286 1.8% 0.181 0.185 2.2%
0.3 0.391 0.423 7.6% 0.248 0.261 5.0% 0.188 0.197 4.6%
0.5 0.343 0.413 17.0% 0.226 0.264 14.4% 0.165 0.187 11.8%
0.7 0.319 0.407 21.6% 0.222 0.266 16.6% 0.152 0.188 19.2%
0.9 0.294 0.397 26.0% 0.190 0.245 22.5% 0.126 0.189 33.3%
Table 2: Scenario 2: Comparison of the models in terms of the prediction MSE and as a function of biomarker correlation and how frequently less-observed biomaker has been actually observed (% Freq).
Correlation 100% Overlap 50% Overlap 0% Overlap
Multi Uni %Dec. Multi Uni %Dec. Multi Uni %Dec.
0.1 0.263 0.270 2.6% 0.250 0.260 3.9% 0.258 0.267 3.1%
0.3 0.247 0.256 3.6% 0.252 0.263 4.1% 0.249 0.265 5.9%
0.5 0.264 0.275 3.8% 0.232 0.250 6.9% 0.252 0.277 9.1%
0.7 0.236 0.247 4.4% 0.254 0.282 10.2% 0.207 0.243 14.7%
0.9 0.231 0.245 5.9% 0.211 0.254 16.8% 0.214 0.275 22.4%
Table 3: Scenario 3: Comparison of the models in terms of prediction MSE and as a function of biomarker correlation and the time-overlap between missing biomarkers (% overlap).

4.1.3 Scenario 3

In building our proposed multivariate longitudinal model, we assumed a separable cross-covariance structure where the model assumes that different longitudinal processes share the same temporal covariance. In particular, our model assumes a shared volatility measure across different biomarkers. Obviously, by using an independent univariate Gaussian processes model (Uni) one can estimate different values for each biomaker process. We, however, claim that despite a shared across longitudinal biomarkers in our proposed multivariate model (Multi), the model is still robust as the model can capture the within-biomarker variances through the covariance matrix (equation (3.5)). To show this, we simulate two longitudinal biomarkers with different volatility measure values. Table 3 supports our claim as our proposed multi model still leads to smaller MSEs compared to the uni model.

4.2 Simulation Studies on the Proposed Joint Multivariate Longitudinal-Survival Model

Our second type of simulations are focused on evaluating the performance of our proposed joint multivariate longitudinal-survival model. We simulated 200 datasets that resembled the real data on end-stage renal disease patients that was obtained from the United States Renal Data System (USRDS). Each dataset included subjects. We simulated longitudinal trajectories for the two biomarkers of albumin () and BMI (), with within subject albumin and within subject BMI measures. Both biomarkers are generated from a joint multivariate Gaussian process with a high-correlation of between the processes. In particular, biomarker measures for each subject are simulated from


is a stack of two subject specific random intercepts for the two processes. Subject-specific random intercepts for the albumins are simulated from the Normal distribution

and the random intercepts for the BMIs are simulated from the Normal distribution . and are both set to . is considered to be equal to , where ’s are simulated from the uniform distribution. is the distance matrix of the form , with a fixed of 0.1 and time points ’s that are sampled uniformly from a followup time with a maximum of 15 months. The matrix is the covariance matrix of the two biomarkers with a high correlation of between the processes and with scaled variances for each biomarker process. Survival times are generated from a Weibull distribution of the form

with the shape parameter of the distribution, , set to 1.5 and the log-scale parameter that is of the form

where were sampled from an equally weighted mixture of two Normal distributions of and . is fixed to 0.5 and

is fixed to -0.3. Censoring times are generated from a uniform distribution and independent of the survival times

with parameters that ensure a censoring rate.

In order to evaluate the performance of our proposed model, we also fit a last-observation carried forward proportional hazard Cox model as well as an alternative joint longitudinal-survival model that models biomarkers independently (Akhavan et al., 2018).

Albumin and BMI random intercepts are assumed to have the priors and , respectively. ’s are assumed to have the prior. Both and had the prior. We assumed matrix to be of the form , where is a matrix with the diagonal elements of 1 and and is the correlation matrix between the two biomarkers that is of size . We consider a prior on and an prior on . We put the prior on the Weibull shape parameter . We also consider independent priors on and . are assumed to be distributed according to an unknown distribution with the Dirichlet process prior. We consider the prior on and we consider to be the standard Normal distribution .

Table 4 shows the simulation results. As expected, the last observation carried forward Cox model leads to estimates that are shrunk towards 0 as this model is blind to the differential subject-specific baseline hazards that are induced by subject-specific values. The estimates under the Cox model are marginalized over all subjects and due to non-collapsibility of these models, the estimates are shrunk toward the null. Further, this model carries the most recent longitudinal measures forward to the event time where as the longitudinal measures at the event time might be quite different from the most recent measures. This is also caused the estimates to shrink towards 0. Next, the joint model with univariate longitudinal biomarkers provides estimates that are closer to the true values compared to the Cox model as the model is capable of detecting baseline subject-specific values as well as providing a good prediction of the two biomarkers at time by flexibly modeling the trajectory of the biomarkers. Third, our joint multivariate longitudinal-survival model proposed here is capable of modeling multiple biomarkers simultaneously by taking the correlation between the biomarkers and hence, leads to even closer coefficient estimates compared to our joint univariate longitudinal-survival model.

Covariate of True Conditional LOCF Cox Uni. Joint Model Multi Joint Model
Interest Estimand Mean SD MSE Mean SD MSE Mean SD MSE
Albumin(t) 0.5 0.242 0.089 0.077 0.443 0.124 0.012 0.481 0.101 0.009
BMI(t) -0.3 -0.135 0.055 0.027 -0.278 0.127 0.003 -0.287 0.109 0.003
Table 4:

Coefficient estimates under different models along with the corresponding standard deviation and mean-squared error values per estimated coefficient.

5 Application of the Proposed Joint Multivariate Longitudinal Survival Model to DaVita Data on Hemodialysis Patients

In this section, we apply our proposed joint multivariate longitudinal-survival model to a specific study of hemodialysis patients discussed in introduction. Here, we focus on a subset of data where every patient has at least 8 longitudinal measures of albumin and at least 8 longitudinal measures of calcium. This subset includes hemodialysis patients with an overall censoring rate in the data is 25.6%.

Although the longitudinal albumin and calcium biomarkers are supposed to be measured during every lab visit, however, we noticed that in our study cohort, on average at 12.4% of lab visits neither of these two biomarkers were measured. We also noticed that on average in 26.8% of lab visits there were no measured albumin biomarker and in 15.1% lab visits, there were no measured calcium biomarker.

With the aim of testing the association between mortality and the value of the biomarkers of interest, albumin and calcium, among hemodialysis patients, we analyze the data once using last-observation carried forward Cox model, another time using a univariate joint longitudinal-survival model (Uni. Joint Model), and finally using our recent multivariate joint longitudinal-survival model (Multi. Joint Model). In order to adjust for potential confounder factors, other than longitudinal albumin and calcium measures, we also include age, sex, race, a baseline measure of phosphorus, and a baseline measure of iron.

Unlike the last observation carried forward Cox model that uses the most recent albumin and calcium biomarker values as the values of these biomarkers at each event time, our proposed joint longitudinal-survival models flexibly model the trajectory of these biomarkers over time and at each event time, the models impute the most relevant biomarker values according to the trajectories of those biomarkers. Further, unlike the Cox model that marginalize covariate effects across all subjects, our proposed joint models are capable of detecting differential subject-specific baseline hazards. Our univariate joint longitudinal-survival method models the longitudinal albumin and calcium processes independently of each other. Our proposed multivariate joint longitudinal-survival model introduced in this paper, however, models the two processes simultaneously. Simultaneously modeling the two processes will allow a better longitudinal trajectory specification as the model can borrow information from measurements of one process when measurements of the other process are missing.

Table 5 shows the results of analyzing the data using the three models of last-observation carried forward Cox, our proposed univariate joint longitudinal-survival model, our proposed multivariate joint longitudinal-survival model. The results from all three models consistently show that age and albumin value at the time of death are significant risk factors of mortality among hemodialysis patients. The results from the Cox model show that each 1 g/dL decrement in albumin level corresponds to 3.4 times higher risk of death. The risk of death per each 1 g/dL decrement in albumin is estimated to be 4.2 and 5.1 times higher under our proposed univariate joint model and multivariate joint model, respectively. Further, compared to the univariate joint model, the multivariate joint model leads to 32% and 36% reduction in the 95% credible region of the estimated effect of every one unit decrement in albumin and calcium, respectively. This observation was expected as the multivariate joint model by simultaneously modeling albumin and calcium trajectories, estimate those trajectories with higher precision compared with the univariate joint model.

LOCF Cox Model Uni. Joint Model Multi. Joint Model
No. of No. of Relative Risk Relative Risk Relative Risk
Covariates Cases Deaths (95% CI) (95% CR) (95% CR)
Age (10y) 929 691 1.33 (1.20,1.48) .001 1.50 (1.33,1.68) 1.56 (1.35,1.76)
          Men 546 412 1.0 1.0 1.0
          Women 383 279 1.02 (0.78,1.32) 0.90 1.01 (0.72,1.41) 1.01 (0.73,1.41)
          White 489 337 1.0 1.0
          Black 264 204 1.79 (0.90,3.58) 0.11 1.76 (0.82,3.69) 1.78 (0.84,3.76)
          Hispanic 118 102 0.71 (0.40,1.25) 0.23 0.71 (0.39,1.26) 0.73 (0.37,1.28)
          Other 58 48 0.55 (0.28,1.11) 0.10 0.57 (0.20,1.18) 0.58 (0.20,1.18)
(mg/dL) 929 691 1.07 (0.99,1.17) 0.08 1.10 (0.99,1.23) 1.10 (0.99,1.23)
(g/dL) 929 691 0.99 (0.98,0.99) 0.002 0.98 (0.94,1.02) 0.98 (0.94,1.02)
Serum albumin(t)
(1-g/dL decrement) 929 691 3.36 (2.64,4.29) 0.0001 4.17 (2.78,5.72) 5.11 (3.86,6.29)
(mg/dL) 929 691 1.09 (0.87,1.37) 0.45 1.19 (0.72,1.86) 1.27 (0.92,1.69)
Table 5: Results of analyzing the association between the longitudinal albumin and calcium biomarkers and mortality among hemodialysis patients. A cohort of 929 hemodialysis subjects were followed over a maximal follow-up time of 5 years. Three separate models of last-observation carried forward Cox, univariate joint longitudinal-survival model, and multivariate joint longitudinal-survival model were fit to the data.

6 Discussion

When monitoring the health of subjects, often times multiple risk factors are measured over time. Collected longitudinal risk factors are often correlated with each other as they are measures taken on the same subject. Modeling these longitudinal risk factors simultaneously where the correlation between the risk factors are taken into account can be beneficial, specially when there exists differential measuring density in the collected risk factors. Further, the association between the collected risk factors and the survival outcomes is often the practitioners’ primary interest. In this paper, we proposed a joint longitudinal-survival modeling framework with a longitudinal component capable of modeling multiple longitudinal processes simultaneously with the correlation between those processes taken into account. Our modeling framework is robust to common distributional assumptions as by using the Bayesian non-parameteric Gaussian process and Dirichlet process techniques, we avoid common functional and distributional assumptions in the model.

We used synthetic data in order to show the benefit of simultaneously modeling the trajectories of multiple longitudinal processes using our proposed multivariate longitudinal model as opposed to separate independent longitudinal models each modeling the trajectory of one longitudinal process independently from other longitudinal processes. Our findings show that a multivariate model has more precision in estimating the underlying trajectories of the longitudinal risk factors. Next, using synthetic data we showed that our proposed joint multivariate longitudinal-survival model performs better in terms of mean-squared error of the estimated survival coefficients compared to the modeling framework we introduced elsewhere where the longitudinal biomarkers modeled independently (Akhavan et al., 2018).

Our proposed modeling framework has some limitations. Our modeling framework is limited to the proportional hazards models only. Further, our method is computationally demanding and may not be scalable as number of subjects and within-subject measurements increase. In future, our modeling framework can be extended by relaxing the proportional hazard assumption on the survival component. Also, one by using alternatives to the conventional MCMC techniques, including variational methods can make our modeling framework more computationally efficient.

In order to test the association between the longitudinal albumin and calcium biomarkers and mortality among hemodialysis patients, we used data on 929 hemodialysis patients. We analyzed the data using three models of last-observation carried forward Cox model, the univariate joint longitudinal-survival model we proposed elsewhere (Akhavan et al., 2018), and the multivariate longitudinal-survival model that was proposed in this paper. While the results are consistent across all models, our proposed multivariate joint model that is capable of modeling the trajectory of longitudinal biomarkers with higher precision, leads to stronger estimated biomarker with higher precision for the estimated effect.


Babak Shahbaba was supported by the NIH grant R01AI107034.


  • Akhavan et al. (2018) [author] Akhavan, S.S., Holsclaw, T.T., Shahbaba, B.B. Gillen, D. L.D. L. (2018). A Flexible Joint Longitudinal-Survival Model for Analysis of End-Stage Renal Disease Data. ArXiv e-prints.
  • Berger et al. (2003)

    [author] Berger, JM Bernardo MJ Bayarri JOJ. B. M. B. J., Dawid, APA., Smith, D Heckerman AFMD. H. A. West, MM. (2003). Markov chain Monte Carlo-based approaches for inference in computationally intensive inverse problems. In Bayesian Statistics 7: Proceedings of the Seventh Valencia International Meeting 181. Oxford University Press.

  • Bernardo et al. (1998) [author] Bernardo, JMJ., Berger, JOJ., Dawid, APA., Smith, AFMA. et al. (1998). Non-Stationary Spatial Modeling.
  • Flaxman et al. (2015) [author] Flaxman, SethS., Gelman, AndrewA., Neill, DanielD., Smola, AlexA., Vehtari, AkiA. Wilson, Andrew GordonA. G. (2015). Fast hierarchical Gaussian processes. Manuscript in preparation.
  • Gelfand and Banerjee (2010) [author] Gelfand, Alan EA. E. Banerjee, SudiptoS. (2010). Multivariate spatial process models. Handbook of Spatial Statistics 495–515.
  • Gelman et al. (2014) [author] Gelman, AndrewA., Carlin, John BJ. B., Stern, Hal SH. S. Rubin, Donald BD. B. (2014). Bayesian data analysis 2. Chapman & Hall/CRC Boca Raton, FL, USA.
  • Grzebyk and Wackernagel (1994)

    [author] Grzebyk, MichelM. Wackernagel, HansH. (1994). Multivariate analysis and spatial/temporal scales: real and complex models. In Proceedings of the XVIIth International Biometrics Conference 1 19–33. Citeseer.

  • Majumdar and Gelfand (2007) [author] Majumdar, AnandamayeeA. Gelfand, Alan EA. E. (2007). Multivariate spatial modeling for geostatistical data using convolved covariance functions. Mathematical Geology 39 225–245.
  • Matheron (1973)

    [author] Matheron, GeorgesG. (1973). The intrinsic random functions and their applications. Advances in applied probability 439–468.

  • Neal (2011) [author] Neal, Radford M.R. M. (2011). Handbook of Markov Chain Monte Carlo. Chapman & Hall CRC.
  • Ravel et al. (2015) [author] Ravel, VanessaV., Streja, ElaniE., Molnar, Miklos ZM. Z., Rezakhani, SepidehS., Soohoo, MelissaM., Kovesdy, Csaba PC. P., Kalantar-Zadeh, KamyarK. Moradi, HamidH. (2015). Association of aspartate aminotransferase with mortality in hemodialysis patients. Nephrology Dialysis Transplantation gfv310.
  • Schmidt and Gelfand (2003) [author] Schmidt, Alexandra MA. M. Gelfand, Alan EA. E. (2003). A Bayesian coregionalization approach for multivariate pollutant data. Journal of Geophysical Research: Atmospheres 108.
  • Ver Hoef, Cressie and Barry (2004)

    [author] Ver Hoef, Jay MJ. M., Cressie, NoelN. Barry, Ronald PaulR. P. (2004). Flexible spatial models for kriging and cokriging using moving averages and the Fast Fourier Transform (FFT). Journal of Computational and Graphical Statistics 13 265–282.

  • Yadrenko (1987) [author] Yadrenko, MIM. (1987). Correlation theory of stationary and related random functions, Vol I, Basic results.

Appendix A Implementation

Consider the joint longitudinal-survival likelihood function, , introduced in equation 3.1. Let be a vector of all model parameters with the joint prior distribution . The posterior distribution of the parameter vector can be written as


where and denote longitudinal and time-to-event data respectively, and is the joint model likelihood function (equation 3.1).

The posterior distribution of the parameters in our proposed joint model is not available in closed form. Hence, samples from the posterior distribution of the model parameters are obtained via Markov Chain Monte Carlo (MCMC) methods. In particular, we use the Hamiltonian Monte Carlo (Neal, 2011) to draw samples from the posterior distribution. Prior distributions on parameters of the joint model were explained in details under the longitudinal and survival component specification, and we assume independence among model parameters in the prior (ie. is the product of the prior components specified previously). We provide further detail on less standard techniques for sampling from the posterior distribution when using a multivariate GP prior and we explain how to evaluate the survival portion of the likelihood function when time-varying covariates are incorporated into the model.

a.1 Evaluation of the Longitudinal Likelihood

Consider equation (3.7) and equation (3.8) where we introduced a flexible longitudinal model to simultaneously model multiple longitudinal biomarkers by using the Gaussian process prior. By marginalizing over in equation (3.7), one can show


In order to sample from the posterior distribution of the parameters of the joint longitudinal-survival model introduced in Section 3

, at each iteration of the MCMC, we need to compute the log posterior probability. Computing the log posterior probability involves evaluation of

and . This requires a memory space of and a computation time of per subject . Consider matrix , where the element of is . can be pre-computed prior to starting the MCMC process. Further, by using the eigen-value decomposition technique, one may make the calculation of the matrix determinant and inverse of the covariance matrix more computationally efficient. Using a similar idea proposed by Flaxman et al. (2015), we propose the following fast multivariate Gaussian process computation approach. We start pre-computing the matrix. Also, we can pre-compute the eigen-vale decomposition of this matrix prior to starting the MCMC process. Consider an eigen-value decomposition of the following form

where is a matrix of eigen-vectors and is a diagonal matrix of eigen-values. For a scalar , the eigen-value decomposition of is of the form

At each iteration of the MCMC, one can obtain the eigen-value decomposition of the matrix ,that is of the form

where is a matrix of eigen-vectors and is a diagonal matrix of eigen-values. One can then compute efficiently the log-determinant of the cross-covariance matrix, , as


and the inverse of the cross-covariance matrix, , can be efficiently computed as


In equation (A.4), computation of the inverse of the term in the middle is very easy as it’s a diagonal matrix. Using our proposed efficient computation technique introduced here, we noticed a 30 times faster computation speed in our simulations.

a.2 Evaluation of the Survival Likelihood

We evaluate the survival likelihood using piece-wise integration. Consider the survival time for subject that is denoted by and is distributed according to a Weibull distribution with shape parameter and scale parameter , where