Efficient Estimation of the Maximal Association between Multiple Predictors and a Survival Outcome

12/21/2021
by Tzu-Jung Huang, et al.

This paper develops a new approach to post-selection inference for screening high-dimensional predictors of survival outcomes. Post-selection inference for right-censored outcome data has been investigated in the literature, but much remains to be done to make the methods both reliable and computationally scalable in high dimensions. Machine learning tools are commonly used to provide predictions of survival outcomes, but the estimated effect of a selected predictor suffers from confirmation bias unless the selection is taken into account. The new approach involves the construction of semiparametrically efficient estimators of the linear association between the predictors and the survival outcome, which are used to build a test statistic for detecting the presence of an association between any of the predictors and the outcome. Further, a stabilization technique reminiscent of bagging allows a normal calibration for the resulting test statistic, which enables the construction of confidence intervals for the maximal association between predictors and the outcome and also greatly reduces computational cost. Theoretical results show that this testing procedure is valid even when the number of predictors grows superpolynomially with sample size, and our simulations suggest that this asymptotic guarantee is indicative of the performance of the test at moderate sample sizes. The new approach is applied to the problem of identifying patterns in viral gene expression associated with the potency of an antiviral drug.
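
To make the selection problem mentioned above concrete, the following is a minimal sketch, not the authors' estimator. It ignores censoring entirely, uses plain Pearson correlation in place of the semiparametrically efficient estimators, and omits the stabilization step. The function names (`pearson`, `max_abs_correlation`, `simulate_null`) and the settings `n=100`, `p=200` are hypothetical choices made for illustration. It simulates predictors that are all independent of the outcome and reports the correlation of the predictor selected as most associated.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_abs_correlation(predictors, outcome):
    """Return (index, correlation) of the predictor with the largest |correlation|."""
    corrs = [pearson(column, outcome) for column in predictors]
    best = max(range(len(corrs)), key=lambda j: abs(corrs[j]))
    return best, corrs[best]


def simulate_null(n=100, p=200, seed=0):
    """Simulate p predictors that are all independent of the outcome.

    Assumption: no censoring, standard normal data. This is a toy setting,
    not the right-censored setting studied in the paper.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    return max_abs_correlation(predictors, outcome)


n = 100
selected, corr = simulate_null(n=n, p=200, seed=0)
# A naive z-score that treats the selected predictor as if it had been fixed in advance.
naive_z = abs(corr) * math.sqrt(n)
print(f"selected predictor: {selected}, correlation: {corr:.3f}, naive z: {naive_z:.2f}")
```

Although no predictor is truly associated with the outcome in this simulation, the selected correlation is typically far from zero, so a naive test applied to the selected predictor would tend to reject too often. This is the kind of bias that a calibrated test for the maximal association is meant to avoid.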

research
01/14/2021

Structured Bayesian variable selection for multiple related response variables and high-dimensional predictors

It is becoming increasingly common to study the complex association betw...
research
06/14/2020

Survival Analysis meets Counterfactual Inference

There is growing interest in applying machine learning methods for count...
research
09/15/2023

CAT: a conditional association test for microbiome data using a leave-out approach

In microbiome analysis, researchers often seek to identify taxonomic fea...
research
10/28/2019

Estimation and inference for the indirect effect in high-dimensional linear mediation models

Mediation analysis is difficult when the number of potential mediators i...
research
04/04/2023

Characterizing the contribution of dependent features in XAI methods

Explainable Artificial Intelligence (XAI) provides tools to help underst...
research
08/18/2022

A Decorrelating and Debiasing Approach to Simultaneous Inference for High-Dimensional Confounded Models

Motivated by the simultaneous association analysis with the presence of ...
research
10/26/2020

Accurate Prediction of Neuroblastoma Outcome based on miRNA Expression Profiles

For neuroblastoma, the most common extracranial tumour of childhood, ide...
