Accounting for data heterogeneity in integrative analysis and prediction methods: An application to Chronic Obstructive Pulmonary Disease

11/12/2021
by J. Butts, et al.

Epidemiologic and genetic studies of chronic obstructive pulmonary disease (COPD) and many other complex diseases suggest subgroup disparities (e.g., by sex). We consider this problem from the standpoint of integrative analysis, in which information from different views (e.g., genomics, proteomics, clinical data) is combined. Existing integrative analysis methods ignore subgroup heterogeneity, while stacking the views and accounting for subgroup heterogeneity fails to model the associations among the views. To address these challenges, we propose HIP (Heterogeneity in Integration and Prediction), a statistical approach for joint association and prediction that leverages the strengths of each view to identify molecular signatures that are shared by, or specific to, males and females and that contribute to the variation in COPD, as measured by airway wall thickness. HIP accounts for subgroup heterogeneity, allows for sparsity in variable selection, is applicable to multi-class outcomes and to univariate or multivariate continuous outcomes, and incorporates covariate adjustment. We develop efficient algorithms in PyTorch. Our COPD analysis identified several proteins, genes, and pathways that are common to, or specific to, males and females; some have been implicated in COPD, while others could lead to new insights into sex differences in COPD mechanisms.
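To make the idea of variables that are "shared by" versus "specific to" subgroups concrete, here is a minimal sketch in plain Python. It is not the HIP penalty or the authors' PyTorch implementation, neither of which is given in this abstract. It assumes a generic two-level group soft-thresholding rule: a shared step that shrinks each variable's loadings jointly across subgroups, followed by a subgroup-specific step. The function names, the tuning parameters `lam_shared` and `lam_specific`, and the toy loadings are all hypothetical.

```python
import math

def shrink_row(row, lam):
    """Group soft-thresholding: scale the row toward zero, or zero it out."""
    norm = math.sqrt(sum(x * x for x in row))
    if norm <= lam:
        return [0.0] * len(row)
    scale = 1.0 - lam / norm
    return [scale * x for x in row]

def select_variables(loadings, lam_shared, lam_specific):
    """Apply a shared, then a subgroup-specific, thresholding step.

    loadings: dict mapping subgroup name -> list of rows, one row of
    latent-component loadings per variable (hypothetical layout).
    """
    names = list(loadings)
    n_vars = len(loadings[names[0]])
    k = len(loadings[names[0]][0])
    result = {name: [] for name in names}
    for j in range(n_vars):
        # Shared step: variable j's loadings from all subgroups form one group.
        joint = [x for name in names for x in loadings[name][j]]
        joint = shrink_row(joint, lam_shared)
        # Specific step: each subgroup's slice is thresholded on its own.
        for i, name in enumerate(names):
            row = joint[i * k:(i + 1) * k]
            result[name].append(shrink_row(row, lam_specific))
    return result

def support(rows):
    """Indices of variables with at least one nonzero loading."""
    return [j for j, row in enumerate(rows) if any(x != 0.0 for x in row)]

# Toy loadings for three variables and two latent components (made-up numbers).
toy_loadings = {
    "males": [[3.0, 4.0], [0.1, 0.0], [2.0, 0.0]],
    "females": [[3.0, 4.0], [0.0, 0.1], [0.1, 0.0]],
}
selected = select_variables(toy_loadings, lam_shared=0.5, lam_specific=0.5)
male_support = support(selected["males"])      # [0, 2]
female_support = support(selected["females"])  # [0]
shared_support = sorted(set(male_support) & set(female_support))  # [0]
```

In this toy example, variable 0 is retained in both subgroups (a shared signal), variable 2 survives only for males (a subgroup-specific signal), and variable 1 is dropped everywhere. A real analysis would estimate the loadings from data across views and choose the tuning parameters by a criterion such as cross-validation.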

research
11/30/2022

Biomarker-guided heterogeneity analysis of genetic regulations via multivariate sparse fusion

Heterogeneity is a hallmark of many complex diseases. There are multiple...
research
02/10/2022

Describing complex disease progression using joint latent class models for multivariate longitudinal markers and clinical endpoints

Neurodegenerative diseases are characterized by numerous markers of prog...
research
12/10/2021

Association study between gene expression and multiple phenotypes in omics applications of complex diseases

Studying phenotype-gene association can uncover mechanism of diseases an...
research
11/28/2022

Robust structured heterogeneity analysis approach for high-dimensional data

Revealing relationships between genes and disease phenotypes is a critic...
research
08/31/2022

Joint Modeling of An Outcome Variable and Integrated Omic Datasets Using GLM-PO2PLS

In many studies of human diseases, multiple omic datasets are measured. ...
research
12/16/2022

Multi-Task Learning for Sparsity Pattern Heterogeneity: A Discrete Optimization Approach

We extend best-subset selection to linear Multi-Task Learning (MTL), whe...
