Fully integrative data analysis of NMR metabolic fingerprints with comprehensive patient data: a case report based on the German Chronic Kidney Disease (GCKD) study

10/08/2018
by Helena U. Zacharias, et al.

Omics data facilitate novel insights into the pathophysiology of diseases and, consequently, into their diagnosis, treatment, and prevention. To that end, omics data must be integrated with other data types such as clinical, phenotypic, and demographic parameters of categorical or continuous nature. Here, we exemplify this data integration problem for a study on chronic kidney disease (CKD), in which complex clinical and demographic parameters were assessed together with one-dimensional (1D) 1H NMR metabolic fingerprints. Routine analysis screens for associations between single metabolic features and clinical parameters, and it must account for confounding variables, which are typically chosen based on expert knowledge. This knowledge can be incomplete or unavailable. This article makes several contributions. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-of-the-art implementation, and report several sanity checks. In particular, we show that the discovered associations remain significant after variable adjustment based on expert knowledge. In contrast, we illustrate that in univariate screening approaches the discovery of associations can be biased by incorrect or incomplete expert knowledge. Finally, we exemplify how our data integration approach reveals important associations between CKD comorbidities and metabolites. Moreover, we evaluate the predictive performance of the estimated models on independent validation data and contrast the results with a naive screening approach.
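The bias that an omitted confounder can introduce into univariate screening can be illustrated with a small simulation. The following is a minimal sketch only and is not the framework or implementation described in the abstract. It assumes purely synthetic data in which a hypothetical confounder (`age`) drives both a hypothetical metabolic feature (`metabolite`) and a hypothetical clinical parameter (`clinical`), and it uses ordinary least squares as a stand-in for a screening test. All variable names, effect sizes, and the helper `ols_coef` are illustrative assumptions.

```python
import numpy as np


def ols_coef(y, X):
    """Return least-squares coefficients of y on X, intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 5000

# Synthetic data (assumption): 'age' influences both variables, but the
# metabolite has no direct effect on the clinical parameter.
age = rng.normal(size=n)
metabolite = 0.8 * age + rng.normal(size=n)
clinical = 0.8 * age + rng.normal(size=n)

# Naive screening: regress the clinical parameter on the metabolite alone.
naive_slope = ols_coef(clinical, metabolite)[1]

# Adjusted screening: include the confounder as an additional covariate.
adjusted_slope = ols_coef(clinical, np.column_stack([metabolite, age]))[1]

print(f"naive slope:    {naive_slope:.3f}")
print(f"adjusted slope: {adjusted_slope:.3f}")
```

In this toy setting the naive slope is clearly nonzero although no direct effect exists, while the adjusted slope is close to zero. If the confounder were unknown or not recorded, a univariate screen would have no way to make this correction.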

