Unlocking Retrospective Prevalent Information in EHRs – a Pairwise Pseudolikelihood Approach

09/03/2023
by Nir Keret, et al.

Typically, electronic health record data are not collected with a specific research question in mind. Instead, they comprise numerous individuals recruited at different ages, whose medical, environmental and often also genetic data are collected. Some phenotypes, such as disease-onset ages, may be reported retrospectively if the event preceded recruitment, and such observations are termed "prevalent". The standard method to accommodate this "delayed entry" conditions on the entire history up to recruitment, so the retrospective prevalent failure times are conditioned upon and cannot contribute to estimating the disease-onset age distribution. An alternative approach conditions only on survival up to the recruitment age and on the recruitment age itself. This approach allows the prevalent information to be incorporated, but it brings about numerical and computational difficulties. In this work we develop consistent estimators of the coefficients in a regression model for the age at onset that utilize the prevalent data. Asymptotic results are provided, and simulations are conducted to showcase the substantial efficiency gain that may be obtained by the proposed approach. In particular, the method is highly useful in leveraging large-scale repositories for replicability analysis of genetic variants. Indeed, an analysis of urinary bladder cancer data reveals that the proposed approach yields about twice as many replicated discoveries as the popular approach.
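To make the terminology concrete, the following is a minimal sketch of how records might be labelled as prevalent, incident, or censored. It is an illustration only and not the estimator developed in the paper; the `Participant` class, the `classify` function, the toy ages, and the choice to treat an onset exactly at the recruitment age as incident are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    """Hypothetical record: ages in years; onset_age is None if no onset was observed."""
    recruitment_age: float
    onset_age: Optional[float]


def classify(p: Participant) -> str:
    """Label a record relative to its recruitment age (illustrative convention)."""
    if p.onset_age is None:
        return "censored"    # no disease onset observed during follow-up
    if p.onset_age < p.recruitment_age:
        return "prevalent"   # onset preceded recruitment, reported retrospectively
    return "incident"        # onset at or after recruitment (tie rule is an assumption)


# Toy cohort with made-up ages, for illustration only.
cohort = [
    Participant(recruitment_age=50, onset_age=42),
    Participant(recruitment_age=45, onset_age=61),
    Participant(recruitment_age=60, onset_age=None),
    Participant(recruitment_age=55, onset_age=48),
    Participant(recruitment_age=40, onset_age=40),
]

counts = Counter(classify(p) for p in cohort)
observed_onsets = counts["prevalent"] + counts["incident"]

# Onset ages labelled "prevalent" are the ones that, per the abstract, the standard
# delayed-entry approach conditions upon rather than uses for estimation.
print(f"prevalent onsets: {counts['prevalent']} of {observed_onsets} observed onsets")
```

In this toy cohort, half of the observed onset ages are prevalent, which gives a sense of how much event information an analysis could leave unused when prevalent cases are common.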

