A Kernel-Based Neural Network for High-dimensional Genetic Risk Prediction Analysis

by Xiaoxi Shen, et al.

Risk prediction capitalizing on emerging human genome findings holds great promise for new prediction and prevention strategies. While the large amounts of genetic data generated from high-throughput technologies offer us a unique opportunity to study a deep catalog of genetic variants for risk prediction, the high dimensionality of genetic data and the complex relationships between genetic variants and disease outcomes bring tremendous challenges to risk prediction analysis. To address these rising challenges, we propose a kernel-based neural network (KNN) method. KNN inherits features from both linear mixed models (LMM) and classical neural networks and is designed for high-dimensional risk prediction analysis. To deal with datasets with millions of variants, KNN summarizes genetic data into kernel matrices and uses the kernel matrices as inputs. Based on the kernel matrices, KNN builds a single-layer feedforward neural network, which makes it feasible to consider complex relationships between genetic variants and disease outcomes. The parameter estimation in KNN is based on MINQUE, and we show that, under certain conditions, the average prediction error of KNN can be smaller than that of LMM. Simulation studies also confirm the results.
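To make the kernel-summarization step concrete, the following is a minimal sketch of how an n x p genotype matrix can be reduced to n x n kernel matrices, so that downstream computation scales with the number of individuals rather than the number of variants. The specific kernels (a standardized linear kernel and a Gaussian kernel), the function names `linear_kernel` and `gaussian_kernel`, the default bandwidth, and the simulated genotypes are all assumptions made for illustration; the abstract does not state which kernels the authors use, and this sketch does not cover the network or the MINQUE estimation.

```python
import numpy as np


def linear_kernel(G):
    """Standardized linear (product) kernel, K = Z Z^T / p.

    G is an n x p genotype matrix (rows = individuals, columns = variants).
    Hypothetical helper for illustration; not taken from the paper.
    """
    G = np.asarray(G, dtype=float)
    centered = G - G.mean(axis=0)
    sd = centered.std(axis=0)
    sd[sd == 0] = 1.0  # leave monomorphic variants at zero instead of dividing by 0
    Z = centered / sd
    return Z @ Z.T / G.shape[1]


def gaussian_kernel(G, bandwidth=None):
    """Gaussian kernel, K_ij = exp(-||g_i - g_j||^2 / bandwidth).

    The default bandwidth (the number of variants p) is an assumption.
    """
    G = np.asarray(G, dtype=float)
    sq_norms = np.sum(G ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (G @ G.T)
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    if bandwidth is None:
        bandwidth = G.shape[1]
    return np.exp(-sq_dist / bandwidth)


# Simulated genotypes coded as 0/1/2 minor-allele counts (illustrative data only).
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(50, 2000))

K_lin = linear_kernel(G)
K_gau = gaussian_kernel(G)

# Both kernels are n x n regardless of how many variants were measured.
print(K_lin.shape, K_gau.shape)
```

In this sketch the 50 x 2000 genotype matrix becomes two 50 x 50 matrices; a model that takes such matrices as inputs never has to handle the variant dimension directly, which is the property the abstract relies on for datasets with millions of variants.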



Expectile Neural Networks for Genetic Data Analysis of Complex Diseases

The genetic etiologies of common diseases are highly complex and heterog...

Joint Analysis of Individual-level and Summary-level GWAS Data by Leveraging Pleiotropy

A large number of recent genome-wide association studies (GWASs) for com...

White paper: The Helix Pathogenicity Prediction Platform

In this white paper we introduce Helix, an AI based solution for missens...

Statistical Methods and Workflow for Analyzing Human Metabolomics Data

High-throughput metabolomics investigations, when conducted in large hum...

Deep neural networks with controlled variable selection for the identification of putative causal genetic variants

Deep neural networks (DNN) have been used successfully in many scientifi...

Bayesian Neural Networks for Genetic Association Studies of Complex Disease

Discovering causal genetic variants from large genetic association studi...

Locally epistatic genomic relationship matrices for genomic association, prediction and selection

As the amount and complexity of genetic information increases it is nece...