An Efficient Sufficient Dimension Reduction Method for Identifying Genetic Variants of Clinical Significance

01/15/2013
by Momiao Xiong, et al.

Faster and cheaper next-generation sequencing technologies will generate unprecedentedly massive and high-dimensional genomic and epigenomic variation data. In the near future, sequenced genomes will be a routine part of the medical record. A fundamental question is how to efficiently extract genomic and epigenomic variants of clinical utility that will provide information for optimal wellness and intervention strategies. The traditional paradigm for identifying variants of clinical validity is to test the variants for association. However, significantly associated genetic variants may or may not be useful for the diagnosis and prognosis of diseases. An alternative to association studies for finding genetic variants of predictive utility is to systematically search for variants that contain sufficient information for phenotype prediction. To achieve this, we introduce the concepts of sufficient dimension reduction (SDR) and the coordinate hypothesis, which project the original high-dimensional data onto a very low-dimensional space while preserving all information on the response phenotypes. We then formulate the discovery of clinically significant genetic variants as a sparse SDR problem and develop algorithms that can select significant genetic variants from as many as ten million predictors by dividing the whole-genome SDR into a number of sub-SDR problems defined for genomic regions. The sparse SDR is in turn formulated as a sparse optimal scoring problem, but with a penalty that can remove row vectors from the basis matrix. To speed up computation, we develop a modified alternating direction method of multipliers (ADMM) to solve the sparse optimal scoring problem, which can easily be implemented in parallel. To illustrate its application, the proposed method is applied to simulated data and the NHLBI's Exome Sequencing Project dataset.
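
The abstract does not state the exact penalty or how ADMM is modified, so the following is only a minimal sketch under stated assumptions: the row-removing penalty is taken to be a group-lasso (L2,1) penalty on the rows of the basis matrix, the optimal scoring criterion is replaced by a plain multi-response least-squares loss, and standard (unmodified) ADMM is used. The function names `row_soft_threshold` and `admm_row_sparse_regression` are hypothetical and do not come from the paper.

```python
import numpy as np


def row_soft_threshold(V, kappa):
    """Proximal operator of kappa * sum_j ||V[j, :]||_2.

    Each row is shrunk toward zero; rows whose norm is at most kappa
    become exactly zero, which drops the corresponding predictor.
    """
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - kappa / np.maximum(norms, 1e-12))
    return scale * V


def admm_row_sparse_regression(X, Y, lam, rho=1.0, n_iter=500):
    """Minimize 0.5 * ||Y - X B||_F^2 + lam * sum_j ||B[j, :]||_2.

    Assumed stand-in for the sparse optimal scoring problem; solved with
    standard scaled-form ADMM using the splitting B = Z.
    """
    p, k = X.shape[1], Y.shape[1]
    # Factor (X^T X + rho I) once and reuse it in every B-update.
    L = np.linalg.cholesky(X.T @ X + rho * np.eye(p))
    XtY = X.T @ Y
    Z = np.zeros((p, k))  # row-sparse copy of B
    U = np.zeros((p, k))  # scaled dual variable
    for _ in range(n_iter):
        rhs = XtY + rho * (Z - U)
        B = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # smooth step
        Z = row_soft_threshold(B + U, lam / rho)           # row-removal step
        U += B - Z                                         # dual update
    return Z  # rows that are exactly zero mark discarded predictors
```

In this sketch each row of the returned matrix corresponds to one predictor (for example, one variant), and a predictor is kept only if its row is nonzero. Because the problem for each genomic region would be solved separately under the abstract's divide-into-regions strategy, calls on different regions are independent and could be run in parallel.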


