Assessing the Reproducibility of Machine-learning-based Biomarker Discovery in Parkinson's Disease

04/06/2023
by Ali Amelia, et al.

Genome-wide association studies (GWAS) identify genetic variants that are more common in people with a disease, such as Parkinson's disease (PD), than in people without it. Feature selection and machine learning approaches can then be applied to GWAS data to identify potential disease biomarkers. However, technical variation between GWAS, such as differences in genotyping platforms and in the selection criteria for genotyped individuals, affects the reproducibility of the identified biomarkers. To address this issue, we collected five GWAS datasets from the database of Genotypes and Phenotypes (dbGaP) and explored several data integration strategies. We evaluated the agreement among strategies in terms of the single nucleotide polymorphisms (SNPs) identified as potential PD biomarkers. Our results showed low concordance among the biomarkers discovered using different datasets or integration strategies. However, we found fifty SNPs that were identified at least twice and could potentially serve as novel PD biomarkers. These SNPs are indirectly linked to PD in the literature but have not previously been directly associated with it. These findings open up new avenues of investigation.
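
As a rough illustration of what "agreement among strategies" and "identified at least twice" could mean in practice, the sketch below compares sets of selected SNPs. It is not the authors' method: the abstract does not name the concordance measure, so the Jaccard index is an assumption made here, and the function names, strategy labels, and SNP identifiers are all hypothetical toy values.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two SNP collections (assumed concordance measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_concordance(selections):
    """Jaccard index for every pair of datasets or integration strategies."""
    return {
        (x, y): jaccard(selections[x], selections[y])
        for x, y in combinations(sorted(selections), 2)
    }


def recurrent_snps(selections, min_count=2):
    """SNPs selected by at least `min_count` datasets or strategies."""
    counts = Counter(snp for snps in selections.values() for snp in set(snps))
    return sorted(snp for snp, n in counts.items() if n >= min_count)


# Hypothetical toy input: labels and SNP identifiers are made up.
selections = {
    "dataset_A": ["snp1", "snp2", "snp3"],
    "dataset_B": ["snp3", "snp4"],
    "merged_AB": ["snp2", "snp3", "snp5"],
}

concordance = pairwise_concordance(selections)
recurrent = recurrent_snps(selections)
```

With real data, each entry of `selections` would hold the SNPs chosen by one feature selection run; low pairwise values would correspond to the low concordance the abstract reports, and `recurrent_snps` would return the candidates found by more than one run.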


