A powerful test for differentially expressed gene pathways via graph-informed structural equation modeling

05/17/2021
by Jin Jin, et al.

A major task in genetic studies is to identify genes related to human diseases and traits, both to understand the functional characteristics of genetic mutations and to enhance patient diagnosis. Beyond marginal analyses of individual genes, identifying gene pathways, i.e., sets of genes with known interactions that collectively contribute to specific biological functions, can provide more biologically meaningful results. Such gene pathway analysis can be formulated as a high-dimensional two-sample testing problem. Because gene expression datasets typically have limited sample sizes, most existing two-sample tests may have compromised power: they either ignore the auxiliary pathway information on gene interactions or incorporate it inefficiently. We propose T2-DAG, a Hotelling's T^2-type test for detecting differentially expressed gene pathways, which efficiently leverages the auxiliary pathway information on gene interactions through a linear structural equation model. We establish the asymptotic distribution of the test statistic under pertinent assumptions. Simulation studies under various scenarios show that T2-DAG outperforms several representative existing methods, with well-controlled type-I error rates and substantially improved power, even with incomplete or inaccurate pathway information or unadjusted confounding effects. We also illustrate the performance of T2-DAG in an application to detect differentially expressed KEGG pathways between different stages of lung cancer.
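
For orientation, the classical two-sample Hotelling's T^2 test, the family of statistics to which the abstract says T2-DAG belongs, can be sketched as below. This is not the T2-DAG procedure: it estimates the covariance by the pooled sample covariance, uses no pathway information, and requires n1 + n2 - 2 >= p, a condition that small gene expression datasets often violate. The function name `hotelling_t2` and the simulated data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy import stats


def hotelling_t2(x, y):
    """Classical two-sample Hotelling's T^2 test (illustrative, not T2-DAG).

    x, y: arrays of shape (n1, p) and (n2, p); rows are samples,
    columns are genes in the pathway. Returns (T^2, F statistic, p-value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2, p_y = y.shape
    if p != p_y:
        raise ValueError("x and y must have the same number of columns")
    df = n1 + n2 - 2
    if df - p + 1 <= 0:
        # Pooled covariance is singular when p exceeds n1 + n2 - 2.
        raise ValueError("classical T^2 requires n1 + n2 - 2 >= p")

    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled sample covariance under the equal-covariance assumption.
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / df
    pooled = np.atleast_2d(pooled)

    t2 = (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(pooled, diff)
    # Under normality and H0, the scaled T^2 follows F(p, n1 + n2 - p - 1).
    f_stat = (df - p + 1) / (df * p) * t2
    p_value = stats.f.sf(f_stat, p, df - p + 1)
    return t2, f_stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group_a = rng.normal(size=(30, 5))        # simulated expression, 5 genes
    group_b = rng.normal(size=(30, 5)) + 0.5  # shifted mean in every gene
    print(hotelling_t2(group_a, group_b))
```

When the pathway has more genes than the combined sample size allows, the function raises an error instead of returning a statistic. The abstract's motivation is exactly this regime, where it argues that structure from the pathway graph can help.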
