Identifying Genetic Risk Factors via Sparse Group Lasso with Group Graph Structure

09/12/2017
by   Tao Yang, et al.

Genome-wide association studies (GWA studies or GWAS) investigate the relationships between genetic variants, such as single-nucleotide polymorphisms (SNPs), and individual traits. Recently, incorporating biological priors together with machine learning methods in GWA studies has attracted increasing attention. However, in practice, nucleotide-level bio-priors have not been well studied to date. By contrast, gene-level studies, for example of protein-protein interactions and pathways, are more rigorous and better established, and it is potentially beneficial to utilize such gene-level priors in GWAS. In this paper, we propose a novel two-level structured sparse model, called Sparse Group Lasso with Group-level Graph structure (SGLGG), for GWAS. It can be considered as a sparse group Lasso combined with a group-level graph Lasso. Essentially, SGLGG penalizes nucleotide-level sparsity while taking advantage of gene-level priors (both gene groups and networks) to identify phenotype-associated risk SNPs. We employ the alternating direction method of multipliers (ADMM) algorithm to optimize the proposed model. Our experiments on the Alzheimer's Disease Neuroimaging Initiative whole genome sequence data and neuroimaging data demonstrate the effectiveness of SGLGG. As a regression model, it is competitive with state-of-the-art sparse models; as a variable selection method, SGLGG is promising for identifying Alzheimer's disease-related risk SNPs.
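
To make the two penalty levels concrete, the following is a minimal Python sketch, not the authors' implementation. The first function is the standard proximal operator of a sparse group Lasso penalty (element-wise soft-thresholding followed by group-wise shrinkage), the kind of building block an ADMM solver typically calls. The second function, `group_graph_penalty`, is purely illustrative: the abstract does not state the exact form of the group-level graph term, so the difference-of-group-norms form used here is an assumption, as are all function and parameter names.

```python
import numpy as np


def soft_threshold(x, t):
    """Element-wise soft-thresholding: the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def prox_sparse_group_lasso(v, groups, lam1, lam2):
    """Proximal operator of lam1 * ||b||_1 + lam2 * sum_g ||b_g||_2 at v.

    `groups` is a list of index arrays, one per gene (non-overlapping).
    SNP-level sparsity comes from the L1 step, gene-level sparsity from
    the group shrinkage step.
    """
    out = soft_threshold(np.asarray(v, dtype=float), lam1)
    for idx in groups:
        norm = np.linalg.norm(out[idx])
        # Shrink the whole group toward zero; drop it if its norm is <= lam2.
        scale = max(0.0, 1.0 - lam2 / norm) if norm > 0 else 0.0
        out[idx] = scale * out[idx]
    return out


def group_graph_penalty(beta, groups, edges):
    """ASSUMED illustrative group-level graph term (not taken from the paper).

    Sums | ||b_g||_2 - ||b_h||_2 | over gene-gene edges (g, h), which
    encourages connected genes to have similar overall effect sizes.
    """
    norms = np.array([np.linalg.norm(beta[idx]) for idx in groups])
    return float(sum(abs(norms[g] - norms[h]) for g, h in edges))


# Toy example: 4 SNPs in 2 genes, with one edge linking the two genes.
groups = [np.array([0, 1]), np.array([2, 3])]
v = np.array([3.0, 4.0, 0.1, -0.2])
shrunk = prox_sparse_group_lasso(v, groups, lam1=0.5, lam2=1.0)
penalty = group_graph_penalty(np.array([3.0, 4.0, 0.0, 0.0]), groups, [(0, 1)])
```

In the toy example the second gene's SNPs are small enough to be zeroed out entirely, while the first gene's coefficients are shrunk but kept, which is the two-level selection behavior the abstract describes.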

research
02/23/2018

Variable selection via Group LASSO Approach : Application to the Cox Regression and frailty model

In the analysis of survival outcome supplemented with both clinical info...
research
07/28/2018

Group-sparse SVD Models and Their Applications in Biological Data

Sparse Singular Value Decomposition (SVD) models have been proposed for ...
research
04/27/2017

Large-scale Feature Selection of Risk Genetic Factors for Alzheimer's Disease via Distributed Group Lasso Regression

Genome-wide association studies (GWAS) have achieved great success in th...
research
09/07/2018

Logistic Regression Augmented Community Detection for Network Data with Application in Identifying Autism-Related Gene Pathways

When searching for gene pathways leading to specific disease outcomes, a...
research
12/31/2020

Inference post Selection of Group-sparse Regression Models

Conditional inference provides a rigorous approach to counter bias when ...
research
01/04/2023

l_1-2 GLasso: L_1-2 Regularized Multi-task Graphical Lasso for Joint Estimation of eQTL Mapping and Gene Network

A critical problem in genetics is to discover how gene expression is reg...
research
11/11/2017

A Sparse Graph-Structured Lasso Mixed Model for Genetic Association with Confounding Correction

While linear mixed model (LMM) has shown a competitive performance in co...
