A Kernel-Based Neural Network for High-dimensional Genetic Risk Prediction Analysis

by Xiaoxi Shen, et al.

Risk prediction capitalizing on emerging human genome findings holds great promise for new prediction and prevention strategies. While the large amounts of genetic data generated from high-throughput technologies offer a unique opportunity to study a deep catalog of genetic variants for risk prediction, the high dimensionality of genetic data and the complex relationships between genetic variants and disease outcomes pose tremendous challenges to risk prediction analysis. To address these challenges, we propose a kernel-based neural network (KNN) method. KNN inherits features from both linear mixed models (LMM) and classical neural networks and is designed for high-dimensional risk prediction analysis. To deal with datasets containing millions of variants, KNN summarizes genetic data into kernel matrices and uses the kernel matrices as inputs. Based on the kernel matrices, KNN builds a single-layer feedforward neural network, which makes it feasible to consider complex relationships between genetic variants and disease outcomes. Parameter estimation in KNN is based on minimum norm quadratic unbiased estimation (MINQUE), and we show that, under certain conditions, the average prediction error of KNN can be smaller than that of LMM. Simulation studies confirm these results.
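To make the idea of kernel matrices as network inputs concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the product (linear) kernel, the tanh activation, the random weights, and the names `linear_kernel` and `hidden_layer` are all assumptions chosen for illustration, and the MINQUE estimation step is not shown. The sketch only shows how an n-by-p genotype matrix can be summarized into an n-by-n kernel matrix, so that the input size no longer depends on the number of variants p.

```python
import numpy as np

def linear_kernel(G):
    """Summarize an n x p genotype matrix into an n x n product kernel.

    Assumption: columns are standardized and K = Z Z^T / p. The paper may
    use other kernels; this one is picked only for illustration.
    """
    G = np.asarray(G, dtype=float)
    mu = G.mean(axis=0)
    sd = G.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for monomorphic variants
    Z = (G - mu) / sd
    return Z @ Z.T / G.shape[1]

def hidden_layer(K, W, b):
    """One hidden layer whose inputs are the rows of the kernel matrix.

    Assumption: tanh activation; W has shape (n, m) and b has shape (m,).
    """
    return np.tanh(K @ W + b)

# Hypothetical toy data: n individuals, p variants coded 0/1/2, m hidden units.
rng = np.random.default_rng(0)
n, p, m = 6, 1000, 3
G = rng.integers(0, 3, size=(n, p))

K = linear_kernel(G)                    # n x n, independent of p
W = rng.normal(scale=0.1, size=(n, m))  # untrained weights, illustration only
b = np.zeros(m)
U = hidden_layer(K, W, b)               # n x m hidden representation

print(K.shape, U.shape)
```

In this sketch the network sees a 6-by-6 matrix rather than 1000 variant columns; with millions of variants the same reduction applies, at the cost of storing an n-by-n matrix.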




