Correlation-Adjusted Regression Survival Scores for High-Dimensional Variable Selection

02/22/2018
by Thomas Welchowski, et al.

Background: The development of classification methods for personalized medicine is highly dependent on the identification of predictive genetic markers. In survival analysis it is often necessary to discriminate between influential and non-influential markers. Usually, the first step is a univariate screening that ranks the markers according to their associations with the outcome. It is common to perform screening using Cox scores, which quantify the associations between survival and each of the markers individually. Since Cox scores do not account for dependencies between the markers, their use is suboptimal in the presence of highly correlated markers.

Methods: As an alternative to the Cox score, we propose the correlation-adjusted regression survival (CARS) score for right-censored survival outcomes. By removing the correlations between the markers, the CARS score quantifies the associations between the outcome and the set of "de-correlated" marker values. Estimation of the scores is based on inverse probability weighting, which is applied to log-transformed event times. For high-dimensional data, estimation is based on shrinkage techniques.

Results: The consistency of the CARS score is proven under mild regularity conditions. In simulations, survival models based on CARS score rankings achieved higher areas under the precision-recall curve than competing methods. Two example applications to prostate and breast cancer data confirmed these results. CARS scores are implemented in the R package carSurv.

Conclusions: In research applications involving high-dimensional genetic data, the use of CARS scores for marker selection is a favorable alternative to Cox scores even when correlations between covariates are low. Having a straightforward interpretation and low computational requirements, CARS scores are an easy-to-use screening tool in personalized medicine research.
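To make the idea in the Methods paragraph concrete, the following is a minimal Python sketch of a decorrelated, censoring-weighted marker score. It is not the carSurv implementation and may differ from the estimator defined in the paper. The function names `ipc_weights` and `decorrelated_survival_scores` are hypothetical. The sketch assumes untied observation times, uses Kaplan-Meier based inverse-probability-of-censoring weights for all terms involving the log event time, uses plain sample moments for the markers, and omits the shrinkage step that the abstract describes for high-dimensional data.

```python
import numpy as np


def ipc_weights(time, event):
    """Normalized inverse-probability-of-censoring weights (assumes no tied times)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    G = np.ones(len(time))  # Kaplan-Meier estimate of P(C >= t_i)
    for i, t in enumerate(time):
        # Multiply the survival factors of all censoring times strictly before t.
        for s in np.unique(time[(event == 0) & (time < t)]):
            at_risk = np.sum(time >= s)
            censored = np.sum((time == s) & (event == 0))
            G[i] *= 1.0 - censored / at_risk
    # Censored observations get weight zero; events are up-weighted by 1 / G.
    w = np.where(event == 1, 1.0 / G, 0.0)
    return w / w.sum()


def decorrelated_survival_scores(time, event, X):
    """Illustrative score: P^(-1/2) times weighted correlations with log time."""
    X = np.asarray(X, dtype=float)
    w = ipc_weights(time, event)
    y = np.log(np.asarray(time, dtype=float))

    # Weighted mean and standard deviation of the log event times.
    y_mean = w @ y
    y_sd = np.sqrt(w @ (y - y_mean) ** 2)

    # Marginal (weighted) correlation of each marker with log time.
    Xc = X - X.mean(axis=0)
    r = (w * (y - y_mean)) @ Xc / (X.std(axis=0) * y_sd)

    # De-correlate with the inverse matrix square root of the marker correlations.
    P = np.corrcoef(X, rowvar=False)
    vals, vecs = np.linalg.eigh(P)
    P_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.T
    return P_inv_sqrt @ r


# Simulated example: only marker 0 drives survival; marker 1 is correlated with it.
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 4))
X[:, 1] = 0.8 * X[:, 0] + 0.6 * X[:, 1]
event_time = np.exp(X[:, 0] + 0.3 * rng.normal(size=n))
cens_time = rng.exponential(scale=5.0, size=n)
time = np.minimum(event_time, cens_time)
event = (event_time <= cens_time).astype(int)

scores = decorrelated_survival_scores(time, event, X)
ranking = np.argsort(-np.abs(scores))  # markers ordered by absolute score
print(ranking, np.round(scores, 3))
```

In the simulated data, marker 1 has a sizeable marginal correlation with survival only because it is correlated with marker 0. Multiplying by the inverse square root of the marker correlation matrix reduces its score relative to marker 0, which is the behavior the abstract attributes to de-correlation.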


research
05/09/2023

High-dimensional Feature Screening for Nonlinear Associations With Survival Outcome Using Restricted Mean Survival Time

Feature screening is an important tool in analyzing ultrahigh-dimensiona...
research
06/13/2021

AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data

Scoring systems are highly interpretable and widely used to evaluate tim...
research
10/03/2022

Factor-Augmented Regularized Model for Hazard Regression

A prevalent feature of high-dimensional data is the dependence among cov...
research
07/22/2021

Inference for High Dimensional Censored Quantile Regression

With the availability of high dimensional genetic biomarkers, it is of i...
research
03/09/2021

Multimodal fusion using sparse CCA for breast cancer survival prediction

Effective understanding of a disease such as cancer requires fusing mult...
