
Nationally Representative Individualized Risk Estimation Combining Individual Data from Epidemiologic Studies and Representative Surveys with Summary Statistics from Disease Re

by   Lingxiao Wang, et al.

Individualized absolute risk estimates are fundamental to clinical decision-making but are often based on data that do not represent the target population. Current methods improve external validity by including data from population registries, but they require assuming that model parameters (relative risks and/or population attributable risks) are transportable from epidemiologic studies to the population. We propose a two-step weighting procedure to estimate the absolute risk of an event (in the absence of competing events) in the target population without transportability assumptions. The first step improves the external validity of the cohort by creating "pseudoweights" with a scaled propensity-based kernel-weighting method, which fractionally distributes the sample weights of units in an external probability reference survey to cohort units according to their kernel-smoothed distance in propensity score. The second step poststratifies the pseudoweighted events in the cohort to a population disease registry by variables available in the registry. Our approach produces design-consistent absolute risks under correct specification of the propensity model. When the true propensity model is unknown, poststratification improves efficiency and further reduces bias of risk estimates, both overall and by the demographic variables available in the registry. We apply our methods to develop a nationally representative all-cause mortality risk model for potential clinical use.
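
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes propensity scores have already been estimated (for example, as the linear predictor of a logistic model fit to the combined cohort and survey data), uses a plain Gaussian kernel with a user-chosen bandwidth rather than the scaled variant the abstract mentions, and treats the registry as a simple table of event counts per stratum. The function names `kernel_pseudoweights` and `poststratify_events` and all inputs are hypothetical.

```python
import numpy as np

def kernel_pseudoweights(ps_cohort, ps_survey, survey_weights, bandwidth):
    """Step 1 (sketch): share each survey unit's weight among cohort units.

    Each survey unit j hands out its sample weight to cohort units in
    proportion to a Gaussian kernel of the propensity-score distance.
    """
    ps_cohort = np.asarray(ps_cohort, dtype=float)
    ps_survey = np.asarray(ps_survey, dtype=float)
    survey_weights = np.asarray(survey_weights, dtype=float)
    # diff[i, j]: scaled distance between cohort unit i and survey unit j
    diff = (ps_cohort[:, None] - ps_survey[None, :]) / bandwidth
    kern = np.exp(-0.5 * diff ** 2)      # normalizing constants cancel below
    shares = kern / kern.sum(axis=0)     # each column sums to 1
    return shares @ survey_weights       # pseudoweight per cohort unit

def poststratify_events(weights, is_event, stratum, registry_counts):
    """Step 2 (sketch): rescale event weights to match registry counts.

    Within each stratum, the weights of cohort units that had the event
    are scaled so their total equals the registry's event count.
    """
    weights = np.asarray(weights, dtype=float).copy()
    is_event = np.asarray(is_event, dtype=bool)
    stratum = np.asarray(stratum)
    for g, target in registry_counts.items():
        mask = is_event & (stratum == g)
        total = weights[mask].sum()
        if total > 0:                    # skip strata with no cohort events
            weights[mask] *= target / total
    return weights

# Toy inputs (made up for illustration only)
ps_cohort = [-1.0, 0.0, 1.0, 2.0]
ps_survey = [-0.5, 0.5, 1.5]
survey_weights = [100.0, 200.0, 300.0]

pw = kernel_pseudoweights(ps_cohort, ps_survey, survey_weights, bandwidth=0.5)
final_w = poststratify_events(
    pw,
    is_event=[True, False, True, True],
    stratum=["A", "A", "B", "B"],
    registry_counts={"A": 50.0, "B": 120.0},
)
```

Because every survey unit's weight is fully distributed, the pseudoweights sum to the total of the survey weights, so the weighted cohort stands in for the same population the survey represents. After the second step, the weighted event totals in each stratum agree with the registry counts by construction.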



Risk Projection for Time-to-event Outcome Leveraging Summary Statistics With Source Individual-level Data

Predicting risks of chronic diseases has become increasingly important i...

Kpop: A kernel balancing approach for reducing specification assumptions in survey weighting

With the precipitous decline in response rates, researchers and pollster...

Efficient and Robust Propensity-Score-Based Methods for Population Inference using Epidemiologic Cohorts

Most epidemiologic cohorts are composed of volunteers who do not represe...

Doubly Robust Inference when Combining Probability and Non-probability Samples with High-dimensional Data

Non-probability samples become increasingly popular in survey statistics...

Risk-Stratify: Confident Stratification Of Patients Based On Risk

A clinician desires to use a risk-stratification method that achieves co...

One-step TMLE to target cause-specific absolute risks and survival curves

This paper considers one-step targeted maximum likelihood estimation met...

A new weighting method when not all the events are selected as cases in a nested case-control study

Nested case-control (NCC) is a sampling method widely used for developin...