Integrative data analysis where partial covariates have complex non-linear effects by using summary information from a real-world data

03/06/2023
by Jia Liang, et al.

A fully parametric, linear specification may be insufficient to capture complicated patterns in studies of complex features, such as those investigating age-related changes in brain functional abilities. A partially linear model (PLM), which combines parametric and non-parametric components, may fit such data better. This model has been widely applied in economics, environmental science, and biomedical studies. In this paper, we introduce a novel statistical inference framework that gives the PLM high estimation efficiency by synthesizing summary information from external data into the main analysis. The integrative scheme is versatile: it can assimilate various types of reduced models from the external study. The proposed method is shown to be theoretically valid and numerically convenient, and it achieves a high efficiency gain compared to classic PLM methods. We further validate the method on UK Biobank data by evaluating risk factors for brain imaging measures.
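
For readers unfamiliar with the model class, a partially linear model is commonly written in the generic form sketched below. This is the standard textbook formulation, not necessarily the notation used in the paper; the symbols Y, X, Z, \beta, g, and \varepsilon are assumptions chosen for illustration.

```latex
% Generic partially linear model (textbook form; notation is assumed, not taken from the paper)
Y = X^{\top}\beta + g(Z) + \varepsilon, \qquad \mathbb{E}[\varepsilon \mid X, Z] = 0
```

Here \beta is the finite-dimensional parametric component for the covariates X, and g is an unspecified smooth function that captures the non-linear effect of the covariates Z.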


research
03/11/2019

Shapley regressions: A framework for statistical inference on machine learning models

Machine learning models often excel in the accuracy of their predictions...
research
06/12/2021

Regression inference for multiple populations by integrating summary-level data using stacked imputations

There is a growing need for flexible general frameworks that integrate i...
research
06/28/2023

Mediation with External Summary Statistic Information (MESSI)

Environmental health studies are increasingly measuring endogenous omics...
research
11/29/2019

Functional marked point processes – A natural structure to unify spatio-temporal frameworks and to analyse dependent functional data

This paper treats functional marked point processes (FMPPs), which are d...
research
10/01/2022

Paradoxes and resolutions for semiparametric fusion of individual and summary data

Suppose we have available individual data from an internal study and var...
research
04/02/2021

Distributional data analysis with accelerometer data in a NHANES database with nonparametric survey regression models

Accelerometers enable an objective measurement of physical activity leve...
