Impact of model misspecification in shared frailty survival models
Survival models incorporating random effects to account for unmeasured heterogeneity are increasingly used in biostatistical and applied research. Specifically, unmeasured covariates whose omission from the model would lead to biased, inefficient results are commonly accounted for by including a subject-specific (or cluster-specific) frailty term that follows a given distribution (e.g. Gamma or log-Normal). Despite this, in the context of parametric frailty models little is known about the impact of misspecifying the baseline hazard, the frailty distribution, or both. Our aim is therefore to quantify the impact of such misspecification in a wide variety of clinically plausible scenarios via Monte Carlo simulation, using open source software readily available to applied researchers. We generate clustered survival data assuming various baseline hazard functions, including mixture distributions with turning points, and assess the impact of sample size, variance of the frailty, baseline hazard function, and frailty distribution. The models compared include standard parametric distributions and more flexible spline-based approaches; we also include semiparametric Cox models. Our results show the importance of assessing model fit with respect to both the baseline hazard function and the distribution of the frailty: misspecifying the former leads to biased relative and absolute risk estimates, while misspecifying the latter affects absolute risk estimates and measures of heterogeneity. The resulting bias can be clinically relevant. In conclusion, we highlight the importance of fitting sufficiently flexible models and of assessing model fit. We illustrate our conclusions with two applications using data on diabetic retinopathy and bladder cancer.
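To make the data-generating step concrete, the following is a minimal sketch of how clustered survival data with a shared frailty could be simulated. It is not the simulation design used in the study, which this abstract does not specify. The Weibull baseline hazard, the Gamma frailty with mean 1 and variance `theta`, the single binary covariate, the administrative censoring time, and the function name `simulate_shared_frailty` are all assumptions chosen for illustration. Under the model h(t | u, x) = u * lam * p * t^(p-1) * exp(beta * x), event times can be drawn by inverting the conditional survival function.

```python
import math
import random


def simulate_shared_frailty(n_clusters, cluster_size, theta, lam, p, beta,
                            max_follow_up, seed=None):
    """Simulate clustered survival data with a shared Gamma frailty.

    Illustrative assumptions (not taken from the abstract):
    - Weibull baseline hazard h0(t) = lam * p * t**(p - 1)
    - frailty u ~ Gamma(shape=1/theta, scale=theta), so E[u]=1, Var[u]=theta
    - one binary covariate x with log hazard ratio beta
    - administrative censoring at max_follow_up
    Returns a list of (cluster, x, time, event, frailty) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # One frailty per cluster, shared by all of its members.
        u = rng.gammavariate(1.0 / theta, theta)
        for _ in range(cluster_size):
            x = rng.randint(0, 1)
            # Uniform draw on (0, 1] so that log(v) is always defined.
            v = 1.0 - rng.random()
            # Invert S(t | u, x) = exp(-u * lam * t**p * exp(beta * x)).
            t = (-math.log(v) / (lam * u * math.exp(beta * x))) ** (1.0 / p)
            event = 1 if t <= max_follow_up else 0
            rows.append((cluster, x, min(t, max_follow_up), event, u))
    return rows


# Example: 50 clusters of 10 subjects, frailty variance 0.5.
data = simulate_shared_frailty(50, 10, theta=0.5, lam=0.1, p=1.5,
                               beta=0.7, max_follow_up=5.0, seed=1)
```

A misspecification experiment along the lines described above would then fit models with a different baseline hazard or frailty distribution to such data and compare the estimates with the known generating values.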