An Online Updating Approach for Testing the Proportional Hazards Assumption with Streams of Big Survival Data

by Yishu Xue, et al.

The Cox model, which remains the first choice for analyzing time-to-event data even for large datasets, relies on the proportional hazards assumption. When the data size exceeds the computer's memory, the standard statistics for testing the proportional hazards assumption can no longer be easily calculated. We propose an online updating approach with minimal storage requirements that updates the standard test statistic as each new block of data becomes available. Under the null hypothesis of proportional hazards, the proposed statistic is shown to have the same asymptotic distribution as the standard version would have if it could be computed on the entire data with a supercomputer. In simulation studies, the test and its variant based on the most recent data blocks maintain their sizes when the proportional hazards assumption holds and have substantial power to detect different violations of the proportional hazards assumption. The approach is illustrated with a survival analysis of patients with lymphoma from the Surveillance, Epidemiology, and End Results Program. The proposed test promptly identified a deviation from the proportional hazards assumption that was not captured by the test based on the entire data.
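The block-by-block idea described above can be illustrated with a minimal sketch. It is not the authors' statistic: it assumes, purely for illustration, that each data block can be reduced to a hypothetical summary vector `u_k` and a matching variance matrix `v_k`, so that only running sums need to be stored and a quadratic-form statistic can be refreshed after every block. The names `OnlineQuadraticTest`, `update`, `statistic`, and `solve` are all invented for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the caller's data is not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class OnlineQuadraticTest:
    """Hypothetical accumulator: keeps only a p-vector and a p-by-p matrix."""

    def __init__(self, p):
        self.u = [0.0] * p                       # running sum of block summaries
        self.v = [[0.0] * p for _ in range(p)]   # running sum of block variances
        self.blocks = 0

    def update(self, u_k, v_k):
        """Fold in one block's summaries (assumed inputs) and return the statistic."""
        p = len(self.u)
        for i in range(p):
            self.u[i] += u_k[i]
            for j in range(p):
                self.v[i][j] += v_k[i][j]
        self.blocks += 1
        return self.statistic()

    def statistic(self):
        """Quadratic form u' v^{-1} u computed from the running sums."""
        x = solve(self.v, self.u)
        return sum(ui * xi for ui, xi in zip(self.u, x))


test = OnlineQuadraticTest(p=2)
identity = [[1.0, 0.0], [0.0, 1.0]]
first = test.update([1.0, 0.0], identity)
second = test.update([1.0, 2.0], identity)
```

The storage cost here is fixed at one vector and one matrix regardless of how many blocks arrive, which is the property the abstract emphasizes. If the accumulated vector were asymptotically normal with the accumulated variance, a statistic of this form would typically be compared with a chi-square distribution with `p` degrees of freedom; whether that holds depends on how the per-block summaries are constructed, which this sketch does not address.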

