Approximating Partial Likelihood Estimators via Optimal Subsampling

10/10/2022
by Haixiang Zhang, et al.

With the growing availability of large-scale biomedical data, it is often time-consuming or infeasible to directly perform traditional statistical analysis with the relatively limited computing resources at hand. We propose a fast and stable subsampling method to effectively approximate the full data maximum partial likelihood estimator in Cox's model, which reduces the computational burden when analyzing massive survival data. We establish consistency and asymptotic normality of a general subsample-based estimator. The optimal subsampling probabilities, which have explicit expressions, are determined by minimizing the trace of the asymptotic variance-covariance matrix for a linearly transformed parameter estimator. We propose a two-step subsampling algorithm for practical implementation, which substantially reduces computing time compared with the full data method. The asymptotic properties of the resulting two-step subsample-based estimator are established. In addition, a subsampling-based Breslow-type estimator for the cumulative baseline hazard function and a subsample-based estimator of the survival function are presented. Extensive experiments are conducted to assess the proposed subsampling strategy. Finally, we provide an illustrative example using a large-scale lymphoma cancer dataset from the Surveillance, Epidemiology, and End Results Program.
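To make the idea of a subsample-based partial likelihood estimator concrete, the following is a minimal sketch in Python with NumPy, not the authors' algorithm. It draws a uniform subsample without replacement as a stand-in, because the abstract does not state the optimal subsampling probabilities, and it fits Cox's model by Newton's method on a weighted log partial likelihood. The function names `cox_newton` and `breslow`, the simulated data, and the way the `weights` argument enters the risk-set sums (the usual inverse-probability weighting) are all assumptions made for illustration. Tied event times are not handled.

```python
import numpy as np

def cox_newton(times, events, X, weights=None, n_iter=25, tol=1e-8):
    """Maximize a weighted Cox log partial likelihood by Newton's method.
    Hypothetical helper; assumes no tied observation times."""
    n, p = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(-times)  # descending time, so each risk set is a prefix
    d, Z, w = events[order], X[order], w[order]
    beta = np.zeros(p)
    for _ in range(n_iter):
        e = w * np.exp(Z @ beta)
        S0 = np.cumsum(e)                                # risk-set sums
        S1 = np.cumsum(e[:, None] * Z, axis=0)
        S2 = np.cumsum(e[:, None, None] * Z[:, :, None] * Z[:, None, :], axis=0)
        zbar = S1 / S0[:, None]                          # risk-set weighted means
        wd = w * d
        grad = (wd[:, None] * (Z - zbar)).sum(axis=0)    # score vector
        cov = S2 / S0[:, None, None] - zbar[:, :, None] * zbar[:, None, :]
        info = (wd[:, None, None] * cov).sum(axis=0)     # observed information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def breslow(times, events, X, beta, t_eval, weights=None):
    """Breslow-type estimate of the cumulative baseline hazard at t_eval."""
    w = np.ones(len(times)) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(-times)
    t, d, Z, w = times[order], events[order], X[order], w[order]
    S0 = np.cumsum(w * np.exp(Z @ beta))
    jumps = w * d / S0
    return np.array([jumps[t <= s].sum() for s in np.atleast_1d(t_eval)])

# Simulated data (an assumption): baseline hazard 1, exponential censoring.
rng = np.random.default_rng(0)
n, r = 20000, 2000
beta_true = np.array([0.5, -0.5])
X = rng.normal(size=(n, 2))
T = -np.log(rng.uniform(size=n)) / np.exp(X @ beta_true)  # event times
C = rng.exponential(scale=2.0, size=n)                    # censoring times
times, events = np.minimum(T, C), (T <= C).astype(float)

# Full-data estimator versus a uniform subsample of size r.
beta_full = cox_newton(times, events, X)
idx = rng.choice(n, size=r, replace=False)
beta_sub = cox_newton(times[idx], events[idx], X[idx])
Lambda0_sub = breslow(times[idx], events[idx], X[idx], beta_sub, [0.5])

print("full data:", beta_full)
print("subsample:", beta_sub)
print("Breslow at t=0.5:", Lambda0_sub)
```

With uniform sampling the weights are constant and cancel, so the subsample fit is simply the ordinary partial likelihood fit on fewer rows. A non-uniform scheme would pass weights proportional to the inverse of each sampled row's selection probability through the `weights` argument. Sampling with replacement would create tied times, which this sketch does not handle.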


research
02/05/2023

Optimal subsampling for the Cox proportional hazards model with massive survival data

The use of massive survival data has become common in survival analysis....
research
04/13/2023

On the asymptotic properties of a bagging estimator with a massive dataset

Bagging is a useful method for large-scale statistical analysis, especia...
research
01/28/2020

Optimal subsampling for quantile regression in big data

We investigate optimal subsampling for quantile regression. We derive th...
research
11/11/2020

Maximum sampled conditional likelihood for informative subsampling

Subsampling is a computationally effective approach to extract informati...
research
09/29/2021

Asymptotic Properties of the Maximum Smoothed Partial Likelihood Estimator in the Change-Plane Cox Model

The change-plane Cox model is a popular tool for the subgroup analysis o...
research
08/04/2023

Information Geometry and Asymptotics for Kronecker Covariances

We explore the information geometry and asymptotic behaviour of estimato...
